To the memory of Harold and Esther

Preface to the Second Edition


This second edition is an expanded and improved version of the first. The last 
two chapters are entirely new. The other chapters have been revised, taking into 
account the comments of many readers. We are particularly grateful to Florin 
Catrina, Jonathan Korman, Andrew Nicas, Carolyn Pitchik, Heydar Radjavi and 
Zack Wolske for their suggestions. 

The preface to the first edition has been rewritten and divided into two prefaces, 
one for readers and one for instructors. 

There are undoubtedly further improvements that could be made. We would 
appreciate your sending any comments, corrections, or suggestions to any of the 
authors at their e-mail addresses given below. 


Daniel Rosenthal: danielkitairosenthal@gmail.com
David Rosenthal: rosenthd@stjohns.edu
Peter Rosenthal: rosent@math.toronto.edu


Toronto, ON, Canada Daniel Rosenthal 
Queens, NY, USA David Rosenthal 
Toronto, ON, Canada Peter Rosenthal 


Preface for Readers 


The fundamental purpose of this book is to teach you to understand mathematical 
thinking. We have tried to do that in a way that is clear and engaging, and 
emphasizes the beauty of mathematics. You may be reading this book on your own 
or as a text for a course you are enrolled in. Regardless of your reason for reading 
this book, we hope that you will find it understandable and interesting. 

This book contains a lot of mathematics. We do not expect you to necessarily 
read all of it. In the Preface for Instructors, we describe possible courses that use 
only parts of the book. 

Mathematics is a huge and growing body of knowledge; no one can learn more 
than a fraction of it. But the central thing to learn is how to think mathematically. 
It is our experience that mathematical thinking can be learned by almost anyone 
who is willing to make a serious attempt. We invite you to make such an attempt by 
reading at least part of this book. It is important not to let yourself be discouraged if 
you can’t easily understand something. Everyone learning mathematics finds some 
concepts baffling at first, but usually, with enough effort, the ideas become clear. 

One way in which mathematics gets very complex is by building on itself; some 
mathematical concepts are built on a foundation of many other concepts and thus 
require a great deal of background to understand. That is not the case for the topics 
discussed in this book. Reading this book does not require any background other 
than basic high school algebra and, for parts of Chapters 9 and 12, some high school 
trigonometry. 

A few questions, among the many, that you will easily be able to answer after 
reading the relevant parts of this book are the following: Is 137!7 . 379 . 41/5 = 
19!!! . 29145 . 43!? . 475 (see Chapter 4)? Is there a largest prime number (i.e., a 
largest whole number whose only factors are 1 and itself) (Theorem 1.1.5)? If a
store sells one kind of product for 9 dollars each and another kind for 16 dollars 
each and receives 143 dollars for the total sale of both, how many products did the 
store sell at each price (Example 7.2.7)? How do computers send secret messages 
to each other (Chapter 6)? How is the size of an infinite set defined? Are there more 
fractions than there are whole numbers? Are there more real numbers than there are 
fractions? Is there a smallest infinity? Is there a largest infinity (Chapter 10)? What 
are complex numbers (Chapter 9)? Is .3333... really equal to 1/3 (Example 13.2.8)?
What are some infinite-dimensional spaces (Section 14.5)? 

The hardest theorem proven in this book concerns the construction of angles 
using a compass and a straightedge. (A straightedge is a ruler-like device but without 
measurements marked on it. Straightedges are used to draw lines connecting two 
points.) If you are given any angle, it is easy to bisect it (i.e., divide it into two 
equal subangles) by using a compass and a straightedge (we will show you how to 
do that). This and many similar results were discovered by the Ancient Greeks. The 
Ancient Greeks wondered whether angles could be “trisected” in the sense of being 
divided into three equal subangles using only a straightedge and a compass. A lot of 
mathematics beyond that conceived of by the Ancient Greeks was required to solve 
this problem; it was not solved until the nineteenth century. It can be proven that 
many angles, including an angle of 60 degrees, cannot be so trisected. We present a 
complete proof of this as an illustration of complicated but beautiful mathematical 
reasoning. 

The most important question you'll be able to answer after reading at least several 
chapters of this book, although you will have difficulty formulating the answer in 
words, is: what is mathematical thinking really like? If you read and understand 
several chapters and do a fair number of the problems that are provided, you will 
certainly have a feeling for mathematical thinking. 

We hope that you read this book carefully. Reading mathematics is not like 
reading a novel, a newspaper, or anything else. As you go along, you have to 
really reflect on the mathematical reasoning that is being presented. After reading a 
description of an idea, think about it. When reading mathematics you should always 
have a pencil and paper at hand to rework what you read. 

The essence of mathematics consists of theorems, which are statements proven 
to be true. We will prove a number of theorems. When you begin reading about a 
theorem, think about why it may be true before you read our proof. In fact, at some 
points you may be able to prove the theorem we state without looking at our proof 
at all. In any event, you should make at least a small attempt before reading the 
proof in the book. It is often useful to continue such attempts while in the middle of 
reading the proof that we present; once we have gotten you a certain way towards 
the result, see if you can continue on your own. 

If you adopt such an approach and are patient, we believe that you will learn 
to think mathematically. We are also convinced that you will feel that much of the 
mathematics that you learn is beautiful, in the sense that you will find that the logical 
argument that establishes the theorem is what mathematicians call “elegant.” 

We chose the material for this book based on the following criteria: the 
mathematics is beautiful, it is useful in many mathematical contexts, and it is 
accessible without much mathematical background. The theorems that we prove 
have applications to mathematics and to problems in other subjects. 

Each chapter ends with a section entitled “Problems.” The “Problems” sections 
are divided into three subsections. You should do some of the “Basic Exercises” 
to ensure that you have an understanding of the fundamentals of the chapter. 
The subsections entitled “Interesting Problems” contain problems whose solutions
depend upon the material of the chapter and seem to have mathematical or other 
interest. The subsections labeled “Challenging Problems” contain problems that we 
expect you will, indeed, find to be quite challenging. You should not be discouraged 
if you cannot solve some of the problems. However, if you do solve problems that 
you find difficult at first, especially those that we have labeled “challenging,” we 
hope and expect that you will experience some of the pleasure and satisfaction that 
mathematicians feel upon discovering new mathematics. 

Each chapter is divided into sections. Important items, such as definitions and 
theorems, are numbered in a way that locates them within a chapter and a section 
of that chapter. We put the chapter number, then the section number, and then the 
number of the item within that section. For example, 7.2.4 refers to the fourth 
numbered item in section two of chapter seven. 

Readers who wish to omit some of the material (perhaps only at first) should be 
aware of the following. Chapters 1, 2, 4, and 8 may be read without reading any 
other parts of this book. Chapter 5 depends on Chapters 3 and 4, and Chapter 6 
requires Chapter 5. Some of the examples in Chapter 7 depend on Chapter 6; 
the rest of the chapter is independent of Chapter 6. Chapter 8 uses Chapter 4. 
Chapters 9, 10, and 11 are essentially independent of each other and of all other 
chapters. Chapter 12 basically depends only on Chapter 11 and on the concepts of 
rational and irrational numbers as discussed in Chapter 8. Chapter 13 can be read 
independently of the other chapters, and Chapter 14 does not require any of the 
previous material except for the essential properties of convergence of infinite series, 
as discussed in Chapter 13. 


Preface for Instructors


A glance at the table of contents and the Preface for Readers will give you an idea
of the material covered in this text. 
Some features of this book include: 


• Complete proofs that an angle of 60 degrees cannot be trisected with a straightedge
  and compass (Corollary 12.3.24) and that an angle of an integral number n
  degrees can be constructed with a straightedge and compass if and only if n is a
  multiple of 3 (Theorem 12.4.13).

• A thorough discussion of the Principle of Mathematical Induction (Chapter 2).

• A chapter that provides an introduction to Euclidean plane geometry (Chapter 11).

• A complete description of RSA encryption (7.2.5).

• A fairly extensive treatment of cardinality (Chapter 10).

• An introduction to infinite-dimensional spaces (Chapter 14).

• Using the least upper bound property to establish theorems about convergence of
  infinite series (Section 13.4).

• Showing that real numbers can be represented by infinite decimals (Section 13.6).

• A proof that the infinite series consisting of the reciprocals of the prime numbers
  diverges (Theorem 13.7.8).


Since the only prerequisite for understanding this book is high school algebra, 
it is suitable as a textbook for a wide variety of courses. In particular, it is our 
view that appropriate parts of the text could be used for courses for mathematics 
or science majors, for courses for other students who want to get an appreciation 
of mathematics, and for courses for prospective teachers. The book is also written 
so as to be useable for independent study by anyone who is interested in learning 
mathematics. In particular, mathematically inclined high school students might be 
directed to this book. 

The main purpose of this book is to teach mathematical thinking. Some
instructors like to begin such a course by discussing basic logic and different kinds 
of proofs. Others prefer to present some interesting mathematics simply and clearly, 
with the expectation that students will learn to think mathematically by being gently 
exposed to the mathematics presented. 

We are in the latter camp. 

The text begins with a basic introduction to the natural numbers. This is followed 
by a chapter that contains a thorough discussion of mathematical induction. A 
student who has learned to understand most of the material in that chapter will 
have obtained some appreciation of mathematical thinking. Learning the material in 
other parts of the book will deepen the student’s understanding and will also teach 
the student a lot of interesting mathematics. The textbook provides the opportunity 
for you to choose from a variety of mathematical topics. 

The following are some descriptions of different courses for which part or all of 
this book could serve as a text. There are many other variants that instructors could 
devise. 

A course covering most of the book would take two semesters. Such a course 
would be suitable for students majoring in mathematics, statistics, computer science, 
or physics. 

On the other hand, there are several different one-semester courses that could be 
based on parts of the book. Instructors can vary the level of these courses by the 
pace at which they proceed, the difficulty of the problems that they assign, and the 
material they omit. 

One natural possibility would be to begin at page 1, proceed at whatever pace is 
comfortable for you and your students, and then see where you end up. 

Other possibilities involve omitting some of the chapters. It is our opinion that 
Chapters 1, 2, 4, and 8 should be part of most courses using this book. Additional 
chapters can be chosen based on the needs of the students, the interests of the 
instructor, and the time available. For example, Chapters 3, 5, 6, and 7 could 
be added (i.e., so that the course covers Chapters 1 through 8). Alternatively,
Chapters 10 and 13 and/or 14 might be included. 

A course containing a proof that some angles cannot be trisected with straightedge
and compass could be based on Chapters 1, 2, 4, 8, 11, and 12. Other chapters
could be added if time permits. 

Fairly leisurely “mathematics appreciation” courses could cover Chapters
1, 2, 3, 4, and 5, or Chapters 1, 2, 4, 8, and 10, or Chapters 1, 2, 4, 8, and 13, or
Chapters 1, 2, 4, 8, and the part of Chapter 14 before Definition 14.5.3.

A one-semester course for prospective or actual teachers of high school mathematics
could cover Chapters 1, 2, 4, 8, 11, and 12. It is our view that Chapter 12
should be of substantial interest to teachers. If they are already familiar with the
fundamentals of Euclidean geometry as presented in Chapter 11, there would likely
be time to add one or more of Chapters 9, 10, 13, or 14. If the instructor does not 
wish to present Chapter 12, a good course for teachers could be based on Chapters 1 
through 8. 

Lectures on parts of some of the chapters can give students a taste of the
topic. For example, a very brief introduction to cardinality could consist of the 
part of Chapter 10 up through Theorem 10.2.3. Chapter 14 describes some finite 
and infinite-dimensional spaces. The part before Definition 14.5.3 is completely 
independent of the rest of the book; the balance requires the concept of convergence 
of series. 

Chapter 11 is an essentially self-contained introduction to geometry. 

Chapter 13, which does not significantly rely on any other chapters, is intended 
to provide a first introduction to concepts of analysis by explaining convergence of 
infinite series in a direct manner. The idea of “adding lots of terms to get close to 
the sum” has some intuitive appeal. In our experience, many students who are taught 
the traditional approach are confused by the distinction between convergence of the 
sequence of terms and convergence of the sequence of partial sums. That is why we 
do not discuss convergence of any sequences other than sequences of partial sums. 
Also, we do not use “sigma notation” within the chapter since some students find it 
to be a barrier to understanding. We define least upper bounds and use that concept 
to rigorously prove the fundamental theorems about convergence. The connections 
to the more standard approaches are established in the last problem of the chapter. 

Using a text that contains more than will be covered in the course you are 
teaching provides an opportunity to encourage interested students to do some 
reading on their own, before or after the course ends. 


Chapter 1
Introduction to the Natural Numbers


We assume basic knowledge about the numbers that we count with; that is, the 
numbers 1, 2, 3, 4, 5, 6, and so on. These are called the natural numbers, and 
the collection of all of them is usually denoted by N. They do seem to be very 
natural, in the sense that they arose very early on in virtually all societies. There 
are many other names for these numbers, such as the positive integers and the 
positive whole numbers. Although the natural numbers are very familiar, we will 
see that they have many interesting properties beyond the obvious ones. Moreover, 
there are many questions about the natural numbers to which nobody knows the 
answer. Some of these questions can be stated very simply, as we shall see, although 
their solutions have eluded the thousands of mathematicians who have attempted to 
solve them. 

We assume familiarity with the two basic operations on the natural numbers, 
addition and multiplication. The sum of two numbers will be indicated using the 
plus sign “+”. Multiplication will be indicated by putting a dot in the middle of the 
line between the numbers, or by simply writing the symbols for the numbers next to 
each other, or sometimes by enclosing them in parentheses. For example, the product 
of 3 and 2 could be denoted 3 · 2 or (3)(2). The product of the natural numbers
represented by the symbols m and n could be denoted mn, or m · n, or (m)(n).

We also, of course, need the number 0. Moreover, we require the negative whole 
numbers as well. For each natural number n there is a corresponding negative number
−n such that n + (−n) = 0. Altogether, the collection of positive and negative
whole numbers and 0 is called the integers. It is often denoted by Z.

We assume that you know how to add two negative integers and also how to 
add a negative integer to a positive integer. Multiplication appears to be a bit 
more mysterious. Most people feel comfortable with the fact that, for m and n 
natural numbers, the product of m and (−n) is −mn. What some people find more
mysterious is the fact that (−m)(−n) = mn for natural numbers m and n; that is,
the product of two negative integers is a positive integer. There are various possible 
explanations that can be provided for this, one of which is the following. Using the 
usual rules of arithmetic: 

0 = (−m)(0) = (−m)(−n + n) = (−m)(−n) + (−m)(n)

Adding mn to both sides of this equation gives

0 + mn = (−m)(−n) + (−m)(n) + mn

or

mn = (−m)(−n) + ((−m) + m) · n

Thus,

mn = (−m)(−n) + 0 · n

so

mn = (−m)(−n)


Therefore, the fact that (−m)(−n) = mn is implied by the other standard rules of
arithmetic. 
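
As a concrete instance of the argument above, here is the same chain of equalities written out for the illustrative choice m = 3 and n = 2 (these particular values are our own, chosen only as an example). It uses the familiar fact that (−3)(2) = −6.

```latex
\begin{align*}
0 &= (-3)(0) = (-3)(-2 + 2) = (-3)(-2) + (-3)(2) = (-3)(-2) + (-6), \\
6 &= 0 + 6 = (-3)(-2) + (-6) + 6 = (-3)(-2).
\end{align*}
```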


1.1 Prime Numbers 


One of the important concepts we will study is divisibility. For example, 12 is 
divisible by 3, which means that there is a natural number (in this case, 4) such 
that the product of 3 and that natural number is 12. That is, 12 = 3 · 4. In general,
we say: 


Definition 1.1.1. The integer m is divisible by the integer n if there exists an integer 
q such that m = nq. 


There are many other terms that are used to describe such a relationship. For 
example, if m = nq, we may say that n and q are divisors of m and that each of n 
and q divides m. Note that every integer divides 0, since 0 = n · 0 for every integer
n. The terminology “q is the quotient when m is divided by n” is also used when
n is different from 0. In this situation, n and q are also sometimes called factors of
m; the process of writing an integer as a product of two or more integers is called 
factoring the integer. 
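
Definition 1.1.1 can be tried out on a computer. The following is a minimal Python sketch, not part of the book's development; the function name `divides` is our own choice. It assumes only that, for a nonzero n, an integer q with m = nq exists exactly when dividing m by n leaves remainder 0.

```python
def divides(n: int, m: int) -> bool:
    """Return True if m is divisible by n in the sense of Definition 1.1.1,
    that is, if some integer q satisfies m == n * q."""
    if n == 0:
        # 0 * q is always 0, so 0 divides m only when m itself is 0.
        return m == 0
    # For nonzero n, such a q exists exactly when the remainder is 0.
    return m % n == 0


print(divides(3, 12))  # True, since 12 = 3 * 4
print(divides(5, 12))  # False
print(divides(7, 0))   # True, since 0 = 7 * 0
```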

The number 1 is a divisor of every natural number since, for each natural number
m, m = 1 · m. Also, every natural number m is a divisor of itself, since m = m · 1.




The number 1 is the only natural number that has only one natural number
divisor, namely itself. Every other natural number has at least two divisors, itself 
and 1. 


Definition 1.1.2. A prime number is a natural number greater than 1 whose only 
natural number divisors are 1 and the number itself. 


The first prime number is 2. The primes continue: 3, 5, 7, 11, 13, 17, 19, 23, 29, 
31, and so on. 

And so on? Is there a largest prime? Or does the sequence of primes continue 
without end? There is, of course, no largest natural number. For if n is any natural
number, then n + 1 is a natural number and n + 1 is bigger than n. It is not so easy
to determine if there is a largest prime number or not. If p is a prime, then p + 1 is
almost never a prime. If p = 2, then p + 1 = 3 and p and p + 1 are both primes.
However, 2 is the only prime number p for which p + 1 is prime. This can be proven
as follows. First note that, since every even number is divisible by 2, 2 itself is the
only even prime number. Therefore, if p is a prime other than 2, then p is odd and
p + 1 is an even number larger than 2 and is thus not prime.

Is it nonetheless true that, given any prime number p, there is a prime number 
larger than p? Although we cannot get a larger prime by simply adding 1 to a given
prime, there may be some other way of establishing that there is a prime number 
larger than any given one. We will answer this question after learning a little more 
about primes. 

A natural number, other than 1, that is not prime is said to be composite. 
(The number 1 is special and is neither prime nor composite.) For example, 4, 68,
129, and 2010 are composites. Thus, a composite number is a natural number that 
has a divisor in addition to itself and 1. 

To determine if a number is prime, what potential factors must be checked 
to eliminate the possibility that there are factors other than the number and 1? 
Fortunately, to check whether or not a natural number m is prime, you need not 
check whether every natural number less than m divides m. 


Theorem 1.1.3. Let m be a natural number other than 1. If m does not have a 
natural number divisor that is greater than 1 and no larger than the square root of 
m, then it is prime. 


Proof. If m = n · q, it is not possible that n and q are both larger than the square
root of m, for if two natural numbers are both larger than the square root of m, then
their product is larger than m. It follows that a natural number greater than 1 that is
not prime has at least one divisor that is larger than 1 and is no larger than the square
root of that natural number. □


For example, we can conclude that 101 is prime since none of the numbers 2, 3, 
4, 5, 6, 7, 8, 9, 10 are divisors of 101.
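
Theorem 1.1.3 translates directly into a procedure for testing primality: try every candidate divisor from 2 up to the square root of m. The following Python sketch is one possible way to write that down; the name `is_prime` is our own, and `math.isqrt` (the integer part of the square root) is used so that no rounding issues arise.

```python
from math import isqrt


def is_prime(m: int) -> bool:
    """Test primality by checking divisors from 2 up to the square root of m,
    which suffices by Theorem 1.1.3."""
    if m < 2:
        return False  # 1 is neither prime nor composite
    for d in range(2, isqrt(m) + 1):
        if m % d == 0:
            return False  # d is a divisor other than 1 and m
    return True


print(is_prime(101))  # True: none of 2, 3, ..., 10 divides 101
print([n for n in range(2, 32) if is_prime(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
```

This simple approach is only practical for fairly small numbers; the "sophisticated techniques" used for very large primes are different and are not shown here.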

Using sophisticated techniques and computers, many very large numbers have 
been shown to be prime. For example, 100,000,561 is prime, as is 22,801,763,489. 

The fact that very large natural numbers have been shown to be prime does not
answer the question of whether there is a largest prime. The theorem that there is 
always a prime larger than p for every prime number p cannot be established by 
computing any number of specific primes, no matter how large. 

Over the centuries, mathematicians have discovered many proofs that there is 
no largest prime. We shall present one of the simplest and most beautiful proofs, 
discovered by the Ancient Greeks. 

We begin by establishing a preliminary fact that is required for the proof. 
A statement that is proven for the purpose of being used to prove something else 
is called a “lemma.” We need a lemma. The lemma that we require states that every 
composite number has a divisor that is a prime number. (The proof that we present 
of the lemma is quite convincing, but we shall subsequently present a more precise 
proof; see Lemma 2.2.3.) 


Lemma 1.1.4. Every natural number greater than 1 has a prime divisor.


Proof. If the given natural number is prime, then it is a prime divisor of itself. If the
number, say m, is composite, then m has at least one factorization m = n · q where
neither n nor q is m or 1. If either of n or q is a prime number, then the lemma is
established for m. If n is not prime, then it has a factorization n = s · t, where s
and t are natural numbers other than 1 and n. It is clear that s and t are also divisors
of m. Thus, if either of s and t is a prime number, the lemma is established. If s is
not prime, then it can be factored into a product where neither factor is s or 1, and
so on. Continued factoring must get down to a factor that cannot itself be factored;
i.e., to a factor that is prime. That prime number is a divisor of m, so the lemma is
established. □
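
The "continued factoring" in the proof above can be imitated in a few lines of Python. This is only an illustrative sketch under our own choices: the function name `factor_chain` is hypothetical, and at each stage we arbitrarily pick the largest factor other than the number itself (the proof allows any factor other than 1 and the number). The chain must stop, because each factor is smaller than the one before it.

```python
def factor_chain(m: int) -> list[int]:
    """Starting from m > 1, repeatedly replace the current number by one of
    its factors (other than 1 and itself) until a prime is reached.
    Returns the whole chain; its last entry is a prime divisor of m."""
    if m < 2:
        raise ValueError("m must be a natural number greater than 1")
    chain = [m]
    current = m
    while True:
        # Search downward for a factor of `current` other than 1 and itself.
        factor = next(
            (d for d in range(current - 1, 1, -1) if current % d == 0), None
        )
        if factor is None:
            return chain  # `current` cannot be factored further, so it is prime
        chain.append(factor)
        current = factor


print(factor_chain(60))  # [60, 30, 15, 5]
print(factor_chain(7))   # [7]
```

Every number in the chain divides the one before it, and therefore divides m, which is exactly the observation the proof relies on.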


The following is the ingenious proof of the infinitude of the primes discovered 
by the Ancient Greeks. 


Theorem 1.1.5. There is no largest prime number. 


Proof. Let p be any prime number. We must prove that there is some prime larger 
than p. To do this, we will construct a number that we will show is either a prime 
larger than p or has a prime divisor larger than p. In both cases, we will conclude 
that there is a prime number larger than p. 

Here is how we construct the large number. Let M be the number obtained by
taking the product of all the prime numbers up to and including the given prime p
and then adding 1 to that product. That is,


M = (2 · 3 · 5 · 7 · 11 · 13 · 17 · 19 ··· p) + 1


It is possible that M is a prime number. If that is so, then there is a prime number
larger than p, since M is obviously larger than p. If M is not prime, then it is
composite. We must show that there is a prime larger than p in this case as well. 

Suppose, then, that M is composite. By Lemma 1.1.4, it follows that M has a
prime divisor. Let q be any prime divisor of M. We will show that q is larger than
p and thus that there is a prime larger than p in this case as well.

Consider possible values of q, a prime divisor of M. Surely q is not 2, for


2 · 3 · 5 · 7 · 11 · 13 · 17 · 19 ··· p


is an even number, and thus adding 1 to that number to get M produces an odd
number. That is, M is odd and is therefore not divisible by 2. Since q does divide
M, q cannot be equal to 2.

Similar reasoning shows that q cannot be 3. For


2 · 3 · 5 · 7 · 11 · 13 · 17 · 19 ··· p


is a multiple of 3, so the number obtained by adding 1, namely M, leaves a
remainder of 1 when it is divided by 3. That is, 3 is not a divisor of M. Since q
is a divisor of M, q is not 3.

Exactly the same proof shows that q is not 5, since 5 is a divisor of


2 · 3 · 5 · 7 · 11 · 13 · 17 · 19 ··· p


and thus cannot be a divisor of M. In fact, the same proof establishes that q cannot
be any of the factors 2, 3, 5, ..., p of the product


2 · 3 · 5 · 7 · 11 · 13 · 17 · 19 ··· p


Since every prime number up to and including p is a factor of that product, q cannot
be any of those prime numbers. Therefore q is a prime number that is not any of the
prime numbers up to and including p. It follows that q is a prime number larger than
p, and we have proven that there is a prime number larger than p in the case where
M is composite. Therefore, in both cases, the case where M is prime and the case
where M is composite, we have shown that there is a prime number larger than p.
This proves the theorem. □
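
The construction in the proof can be carried out on a computer for small primes p. The Python sketch below is only an illustration, with function names of our own choosing (`euclid_number`, `prime_larger_than`). It forms M and then returns the smallest divisor of M greater than 1, which is necessarily prime and, by the argument above, larger than p. Running it for a handful of primes is of course no substitute for the proof.

```python
from math import isqrt, prod


def is_prime(m: int) -> bool:
    """Trial division up to the square root of m (Theorem 1.1.3)."""
    return m >= 2 and all(m % d != 0 for d in range(2, isqrt(m) + 1))


def euclid_number(p: int) -> int:
    """The number M: the product of all primes up to and including p, plus 1."""
    return prod(q for q in range(2, p + 1) if is_prime(q)) + 1


def prime_larger_than(p: int) -> int:
    """Return a prime divisor of M; by the proof it must exceed p."""
    M = euclid_number(p)
    for d in range(2, isqrt(M) + 1):
        if M % d == 0:
            return d  # the smallest divisor greater than 1 is always prime
    return M  # no divisor up to the square root, so M itself is prime


print(euclid_number(5))      # 31, which is (2 * 3 * 5) + 1
print(prime_larger_than(5))  # 31
print(prime_larger_than(7))  # 211
```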


Every mathematician would agree that the above proof is “elegant.” If you find 
the proof interesting, then you are likely to appreciate many of the other ideas that 
we will discuss (and much mathematics that we do not cover as well). 


1.2 Unanswered Questions


There are many questions concerning prime numbers that no one has been able 
to answer. One famous question concerns what are called twin primes. Since 2 
is the only even prime number, the only consecutive integers that are prime are 
2 and 3. There are, however, many pairs of primes that are two apart, such as 
{3, 5}, {29, 31}, {101, 103}, {1931, 1933}, and {104471, 104473}. Such pairs are
called twin primes. One question that remains unanswered, in spite of the efforts 
of thousands of mathematicians over hundreds of years, is the question of whether 
there is a largest pair of twin primes. Some very large pairs are known (e.g., 
{1000000007, 1000000009} and many pairs that are even much bigger), but no one
knows if there is a largest such. 
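
Small twin primes are easy to list by machine. The following Python sketch (the function name `twin_primes_up_to` is our own) simply tests each p and p + 2 for primality by trial division. It can find pairs, but it says nothing about whether the list of twin primes ever ends.

```python
from math import isqrt


def is_prime(m: int) -> bool:
    """Trial division up to the square root of m (Theorem 1.1.3)."""
    return m >= 2 and all(m % d != 0 for d in range(2, isqrt(m) + 1))


def twin_primes_up_to(limit: int) -> list[tuple[int, int]]:
    """All pairs (p, p + 2) of primes with p + 2 no larger than limit."""
    return [
        (p, p + 2)
        for p in range(2, limit - 1)
        if is_prime(p) and is_prime(p + 2)
    ]


print(twin_primes_up_to(50))
# [(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43)]
```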

Another very famous unsolved problem is whether or not the Goldbach Conjecture
is true. Several hundred years ago, Goldbach conjectured (that is, said
that he thought that it was probably true) that every even natural number larger
than 2 is the sum of two prime numbers (e.g., 6 = 3 + 3, 20 = 7 + 13, and
22,901,764,050 = 22,801,763,489 + 100,000,561). Goldbach’s Conjecture is
known to be true for many very large even natural numbers, but no one has been 
able to prove it in general (or to show that there is an even number that cannot be 
written as the sum of two primes). 
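
For any particular even number, a computer can search for a way to write it as a sum of two primes. The Python sketch below is a hypothetical helper of our own (`goldbach_pair`), not a method from the book. It returns the representation with the smallest first prime, so for 20 it finds 3 + 17 rather than the 7 + 13 mentioned above. Checking individual numbers this way cannot settle the conjecture.

```python
from math import isqrt


def is_prime(m: int) -> bool:
    """Trial division up to the square root of m (Theorem 1.1.3)."""
    return m >= 2 and all(m % d != 0 for d in range(2, isqrt(m) + 1))


def goldbach_pair(n: int):
    """Return primes (p, q) with p <= q and p + q == n, or None if there
    is no such pair."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None


print(goldbach_pair(6))   # (3, 3)
print(goldbach_pair(20))  # (3, 17)
```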

If you are able to solve the twin primes problem or determine the truth or falsity 
of Goldbach’s Conjecture, you will immediately become famous throughout the 
world and your name will remain famous as long as civilization endures. On the 
other hand, it will almost undoubtedly prove to be extremely difficult to answer 
either of those questions. On the other “other hand,” there is a very slight possibility 
that one or both of those questions have a fairly simple answer that has been 
overlooked by the many great and not-so-great mathematicians who have thought 
about them. In spite of the very small possibility of success, you might find it 
interesting to think about these problems. 


1.3 Problems 


Basic Exercises 


1. Show that the following are composite numbers: 
(a) 68 
(b) 129 
(c) 20,101,116 
2. Which of the following are prime numbers? 
(a) 79 
(b) 153 
(c) 537 
(d) 851,486 


3. Write each of the following numbers as a sum of two primes: 


(a) 100 
(b) 112 


Interesting Problems


4. Verify that the Goldbach Conjecture holds for all even numbers between 4 and 50.
5. Find a pair of twin primes such that each prime is greater than 200.


Challenging Problems


6. Find a prime number p such that the number (2 · 3 · 5 · 7 ··· p) + 1 is not prime.
7. Suppose that p, p + 2, and p + 4 are prime numbers. Prove that p = 3.
   [Hint: Why can’t p be 5 or 7?]
8. Prove that, for every natural number n greater than 2, there is a prime number
   between n and n!. (Recall that, for every natural number n, n!, which is read “n
   factorial”, denotes the product of all the natural numbers from 1 up to n.)
   [Hint: There is a prime number that divides n! − 1.]
   Note that this gives an alternate proof that there are infinitely many prime
   numbers.
9. Prove that, for every natural number n, there are n consecutive composite
   numbers.
   [Hint: (n + 1)! + 2 is a composite number.]
10. Show that a natural number has an odd number of different factors if and only
    if it is a perfect square (i.e., it is the square of another natural number).


Chapter 2
Mathematical Induction


There is a method for proving certain theorems that is called mathematical 
induction. We will give a number of examples of proofs that use this method. 
The basis for mathematical induction, however, is a statement about sets of natural 
numbers. We use the word set informally to mean any collection of things, and each 
“thing” is said to be an element of the set. (For more on sets, see Chapter 10.) Recall 
that the set of all natural numbers is the set {1, 2, 3,...}. Mathematical induction 
provides an alternate description of that set. 


2.1 The Principle of Mathematical Induction 


The way mathematical induction is usually explained can be illustrated by considering
the following example. Suppose that we wish to prove, for every natural number
n, the validity of the following formula for the sum of the first n natural numbers:


\[ 1 + 2 + 3 + \cdots + (n-1) + n = \frac{n(n+1)}{2} \]


One way to prove that this formula holds for every n is the following. First, the 
formula does hold for n = 1, for in this case the left-hand side is just 1 and the 
right-hand side is 1(1 + 1)/2, which is equal to 1. To prove that the formula holds for all
n, we will establish the fact that whenever the formula holds for any given natural 
number, the formula will also hold for the next natural number. That is, we will 
prove that the formula holds forn = k + 1 whenever it holds for n = k. (This 
passage from k to k + | is often called “the inductive step.”) If we prove this fact, 
then, since we know that the formula does hold for n = 1, it would follow from this 
fact that it holds for the next natural number, 2. Then, since it holds for n = 2, it 
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holds for the natural number that follows 2, which is 3. Since it holds for 3, it holds 
for 4, and then for 5, and 6, and so on. Thus, we will conclude that the formula holds 
for every natural number. 

To prove the formula in general, then, we must show that the formula holds for 
n = k +1 whenever it holds for n = k. Assume that the formula does hold for 
n = k, where k is any fixed natural number. That is, we assume the formula 


\[ 1 + 2 + 3 + \cdots + (k-1) + k = \frac{k(k+1)}{2} \]


We want to derive the formula for n = k + 1 from the above equation. We do that 
as follows. Assuming the above formula, add k + 1 to both sides, getting


\[ 1 + 2 + 3 + \cdots + (k-1) + k + (k+1) = \frac{k(k+1)}{2} + (k+1) \]


We shall see that a little algebraic manipulation of the right-hand side of the above 
will produce the formula for n = k + 1. To see this, note that 


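\[
\begin{aligned}
\frac{k(k+1)}{2} + (k+1) &= \frac{k(k+1)}{2} + \frac{2(k+1)}{2} \\
&= \frac{k(k+1) + 2(k+1)}{2} \\
&= \frac{(k+2)(k+1)}{2} \\
&= \frac{(k+1)(k+2)}{2} \\
&= \frac{(k+1)\bigl((k+1)+1\bigr)}{2}
\end{aligned}
\]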


Thus, $1 + 2 + 3 + \cdots + k + (k+1) = \frac{(k+1)((k+1)+1)}{2}$. This equation is
the same as that obtained from the formula by substituting k + 1 for n. Therefore 
we have established the inductive step, so we conclude that the formula does hold 
for all n. 
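
The two ingredients of the argument above, the base case and the inductive step, can be spot-checked numerically. The following Python sketch is only an illustration and not a proof, since it examines finitely many values; the names `sum_first` and `closed_form` are ours and do not come from the text.

```python
def sum_first(n):
    """Add 1 + 2 + ... + n directly, one term at a time."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def closed_form(n):
    """The formula n(n + 1)/2; the product n(n + 1) is always even."""
    return n * (n + 1) // 2


# Base case: the formula holds for n = 1.
assert sum_first(1) == closed_form(1) == 1

# Inductive step, checked for k = 1, ..., 499: adding k + 1 to the
# formula for n = k gives the formula for n = k + 1.
for k in range(1, 500):
    assert closed_form(k) + (k + 1) == closed_form(k + 1)
    assert sum_first(k + 1) == closed_form(k + 1)
```

The first assertion inside the loop is exactly the algebraic identity derived above; the second compares the formula with a direct summation.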

The Principle of Mathematical Induction, which is implicitly used in the above 
proof, is really just an assertion about sets of natural numbers. 

Suppose S is a set of natural numbers that has the following two properties: 


A. The number 1 is in S.
B. Whenever a natural number is in S, the next natural number is also in S. 


The second property can be stated a little more formally: If k is a natural number 
and k is in S, then k + 1 is in S.

What can we say about a set S that has those two properties? Since 1 is in S (by
property A), it follows from property B that 2 is in S. Since 2 is in S, it follows from 
property B that 3 is in S. Since 3 is in S, 4 is in S. Then 5 is in S, 6 is in S, 7 is in 
S, and so on. It seems clear that S must contain every natural number. That is, the 
only set of natural numbers with the above two properties is the set of all natural 
numbers. We state this formally: 


The Principle of Mathematical Induction 2.1.1. If S is any set of natural numbers
with the properties that


A. 1 is in S, and
B. k + 1 is in S whenever k is any number in S,


then S is the set of all natural numbers. 


In introducing the Principle of Mathematical Induction, we gave an indication of 
why it is true. A more formal proof can be based on the following more obvious 
fact, which we assume as an axiom. 


The Well-Ordering Principle 2.1.2. Every set of natural numbers that contains at 
least one element has a smallest element in it. 


We can establish the Principle of Mathematical Induction from the Well-Ordering 
Principle as follows. Suppose that the Well-Ordering Principle holds for all sets 
of natural numbers. Let S be any set of natural numbers and suppose that S
has properties A and B of the Principle of Mathematical Induction. To prove the 
Principle of Mathematical Induction, we must prove that the only such set S is the 
set of all natural numbers. We will do this by showing that it is impossible that there 
is any natural number that is not in S. To see this, suppose that S does not contain all 
natural numbers. Then let T denote the set of all natural numbers that are not in S. 
Assuming that S is not the set of all natural numbers is equivalent to assuming that 
T has at least one element. If this were the case, then well-ordering would imply 
that T has a smallest element. We will show that this is impossible. 

Suppose that t were the smallest element of T. Since 1 is in S, 1 is not in T.
Therefore, t is larger than 1. It follows that t - 1 is a natural number. Since t - 1
is less than the smallest number t in T, t - 1 cannot be in T. Since T contains all
the natural numbers that are not in S, this implies that t - 1 is in S. This, however,
leads to the following contradiction. Since S has property B, (t - 1) + 1 must also
be in S. But this is t, which is in T and therefore not in S. Therefore the assumption
that there is a smallest element of T is not consistent with the properties of S. Thus,
there is no smallest element of T and, by well-ordering, there is therefore no element
in T. This proves that S is the set of all natural numbers.

The way the formal principle applies to examples such as the one given above 
is by letting the set S be the set of natural numbers for which the formula holds. 
Showing that S is the set of all natural numbers is equivalent to showing that the 
formula holds for every natural number. 

There are many very similar proofs of similar formulas. 


Theorem 2.1.3. For every natural number n, 


\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6} \]


Proof. Let S be the set of all natural numbers for which the formula is true. We
want to show that S contains all of the natural numbers. We do this by showing that
S has properties A and B.

For property A, we need to check that $1^2 = \frac{1(1+1)(2 \cdot 1+1)}{6}$. This is true, so S
satisfies property A. To verify property B, let k be in S. We must show that k + 1 is
in S. Since k is in S, the formula holds for k. That is,


\[ 1^2 + 2^2 + 3^2 + \cdots + k^2 = \frac{k(k+1)(2k+1)}{6} \]


Using this formula, we can prove the corresponding formula for k + 1 as follows. 
Adding $(k+1)^2$ to both sides of the above equation, we get


\[ 1^2 + 2^2 + 3^2 + \cdots + k^2 + (k+1)^2 = \frac{k(k+1)(2k+1)}{6} + (k+1)^2 \]


Now we do some algebraic manipulations to the right-hand side to see that it is what 
we want: 


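\[
\begin{aligned}
\frac{k(k+1)(2k+1)}{6} + (k+1)^2 &= \frac{k(k+1)(2k+1) + 6(k+1)^2}{6} \\
&= \frac{(k+1)\bigl(k(2k+1) + 6(k+1)\bigr)}{6} \\
&= \frac{(k+1)\bigl((2k^2+k) + (6k+6)\bigr)}{6} \\
&= \frac{(k+1)(2k^2+7k+6)}{6} \\
&= \frac{(k+1)(k+2)(2k+3)}{6}
\end{aligned}
\]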


The last equation is the formula in the case when n = k + 1, so k + 1 is in S.
Therefore, S is the set of all natural numbers by the Principle of Mathematical
Induction. □
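
The set-based form of the argument can be mirrored in a short Python sketch: membership in S is modelled by a function that tests whether the formula holds for a given n. This is a numerical illustration under our own naming (`sum_of_squares`, `squares_formula`, `in_S` are not from the text), and it checks only finitely many cases.

```python
def sum_of_squares(n):
    """Compute 1^2 + 2^2 + ... + n^2 by direct summation."""
    return sum(i * i for i in range(1, n + 1))


def squares_formula(n):
    """The closed form n(n + 1)(2n + 1)/6 from Theorem 2.1.3."""
    return n * (n + 1) * (2 * n + 1) // 6


def in_S(n):
    """True when the formula of Theorem 2.1.3 holds for this n."""
    return sum_of_squares(n) == squares_formula(n)


# Property A: 1 is in S.
assert in_S(1)

# Property B, spot-checked: adding (k + 1)^2 to the formula for k
# produces the formula for k + 1.
for k in range(1, 300):
    assert squares_formula(k) + (k + 1) ** 2 == squares_formula(k + 1)
    assert in_S(k + 1)
```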


Sometimes one wants to prove something by induction that is not true for all 
natural numbers, but only for those bigger than a given natural number. A slightly 
more general principle that can be used in such situations is the following. 


The Generalized Principle of Mathematical Induction 2.1.4. Let m be a natural 
number. If S is a set of natural numbers with the properties that 


A. m is in S, and
B. k + 1 is in S whenever k is in S and is greater than or equal to m,


then S contains every natural number greater than or equal to m. 


The Principle of Mathematical Induction is the special case of the generalized
principle when m = 1. The generalized principle states that we can use induction 
starting at any natural number, not just at 1. 

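For example, consider the question: which is larger, n! or $2^n$? (Recall that n! is
the product of the natural numbers from 1 up to n.) For n = 1, 2, and 3, we see that


\[
\begin{aligned}
1! &= 1 < 2^1 = 2 \\
2! &= 2 \cdot 1 = 2 < 2^2 = 2 \cdot 2 = 4 \\
3! &= 3 \cdot 2 \cdot 1 = 6 < 2^3 = 2 \cdot 2 \cdot 2 = 8
\end{aligned}
\]

But when n = 4, the inequality is reversed, since

\[ 4! = 4 \cdot 3 \cdot 2 \cdot 1 = 24 > 2^4 = 2 \cdot 2 \cdot 2 \cdot 2 = 16 \]


When n = 5,

\[ 5! = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 120 > 2^5 = 2 \cdot 2 \cdot 2 \cdot 2 \cdot 2 = 32 \]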


If you think about it a bit, it is clear why eventually n! is much bigger than $2^n$.
In both expressions we are multiplying n numbers together, but for $2^n$ we are always
multiplying by 2, whereas the numbers we multiply to build n! get larger and larger.
While it is not true that $n! > 2^n$ for every natural number (since it is not true when
n is 1, 2, or 3), we can, as we now show, use the more general form of mathematical
induction to prove that it is true for all natural numbers greater than or equal to 4.


Theorem 2.1.5. $n! > 2^n$ for $n \geq 4$.


Proof. We use the Generalized Principle of Mathematical Induction with m = 4.
Let S be the set of natural numbers for which the theorem is true. As we saw above,
$4! > 2^4$. Therefore, 4 is in S. Thus, property A is satisfied. For property B, assume
that $k \geq 4$ and that k is in S; i.e., $k! > 2^k$. We must show that $(k+1)! > 2^{k+1}$.
Multiplying both sides of the inequality for k (which we have assumed to be true)
by k + 1 gives


\[ (k+1)(k!) > (k+1) \cdot 2^k \]

The left-hand side is just (k + 1)!; therefore we have the inequality

\[ (k+1)! > (k+1) \cdot 2^k \]


Since $k \geq 4$, $k + 1 > 2$. Therefore, the right-hand side of the inequality, $(k+1) \cdot 2^k$,
is greater than $2 \cdot 2^k = 2^{k+1}$. Combining this with the above inequality, we get


\[ (k+1)! > (k+1) \cdot 2^k > 2^{k+1} \]




Thus, k + 1 is in S, which verifies property B. By the Generalized Principle of
Mathematical Induction, S contains all natural numbers $n \geq 4$. □
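
A quick computation shows the pattern that Theorem 2.1.5 describes: the inequality fails for n = 1, 2, 3 and holds from n = 4 onward. The sketch below is only a finite check, and the function name `factorial_exceeds_power` is our own choice.

```python
from math import factorial


def factorial_exceeds_power(n):
    """True when n! > 2^n."""
    return factorial(n) > 2 ** n


# The inequality fails for n = 1, 2, 3 ...
assert not any(factorial_exceeds_power(n) for n in (1, 2, 3))

# ... and holds at the starting value m = 4 (property A).
assert factorial_exceeds_power(4)

# Inductive step, spot-checked for k = 4, ..., 199: the key fact used
# in the proof is that multiplying by k + 1 beats multiplying by 2.
for k in range(4, 200):
    assert (k + 1) * 2 ** k > 2 ** (k + 1)
    assert factorial_exceeds_power(k + 1)
```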


The following is an example where mathematical induction is useful in establish- 
ing a geometric result. We will use the word “tromino” to denote an L-shaped object 
consisting of three squares of the same size. That is, a tromino looks like this: 


Another way to think of a tromino is that it is the geometric figure obtained by taking 
a square that is composed of four smaller squares and removing one of the smaller 
squares. 

We are going to consider what geometric regions can be covered by trominos, 
all of which have the same size and do not overlap each other. As a first example, 
start with a square made up of 16 smaller squares (i.e., a square that is “4 by 4”) and 
remove one small square from a corner of the square: 


Can the region that is left be covered by trominos (each made up of three small 
squares of the same size as the small squares in the region) that do not overlap each 
other? It can: 


We can use mathematical induction to prove the following. 


Theorem 2.1.6. For each natural number n, consider a square consisting of $2^{2n}$
smaller squares. (That is, a $2^n \times 2^n$ square.) If one of the smaller squares is removed
from a corner of the large square, then the resulting region can be completely 
covered by trominos (each made up of three small squares of the same size as the 
small squares in the region) in such a way that the trominos do not overlap. 


Proof. To begin a proof by mathematical induction, first note that the theorem is 
certainly true for n = 1; the region obtained after removing a small corner square is 
a tromino, so it can be covered by one tromino. 

Suppose that the theorem is true for n = k. That is, we are supposing that if a 
small corner square is removed from any $2^k \times 2^k$ square consisting of $2^{2k}$ smaller
squares, then the resulting region can be covered by trominos. The proof will be
established by the Principle of Mathematical Induction if we can show that the same
result holds for n = k + 1. Consider, then, any $2^{k+1} \times 2^{k+1}$ square consisting of
smaller squares. Remove one corner square to get a region that looks like this:


The region can be divided into four “medium-sized” squares, three of which are
$2^k \times 2^k$ and one of which is $2^k \times 2^k$ with one corner removed, like this:


Now place a tromino in the middle of the region, as illustrated below.


Because of the tromino in the middle, the four “medium-sized” squares remain- 
ing to be covered each have one corner covered or missing. By the inductive 
hypothesis, trominos can be used to cover the rest of each of the four “medium-sized”
squares. This leads to a covering of the entire $2^{k+1} \times 2^{k+1}$ square, thus
finishing the proof by mathematical induction. □
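
The inductive proof is constructive: it can be turned into a recursive procedure that actually produces a covering. The following Python sketch follows the proof step by step, under some conventions of our own that are not in the text: the removed corner is taken to be the top-left cell, the board is a list of rows, the removed cell is marked 0, and each tromino is marked by a positive label shared by its three cells. The recursion is carried down to single cells rather than stopping at the 2 x 2 base case, which amounts to the same thing.

```python
def tile_with_corner_removed(n):
    """Cover a 2^n x 2^n board, minus its top-left cell, with trominos."""
    size = 2 ** n
    board = [[0] * size for _ in range(size)]
    next_label = 1

    def fill(top, left, side, hole_row, hole_col):
        # Cover the side x side square whose top-left cell is (top, left),
        # except for the cell (hole_row, hole_col), which is either the
        # removed cell or a cell already covered by a central tromino.
        nonlocal next_label
        if side == 1:
            return  # the square consists of the excluded cell only
        half = side // 2
        label = next_label
        next_label += 1
        mid_row, mid_col = top + half, left + half
        # Each quadrant, together with its cell that touches the center.
        quadrants = [
            (top, left, mid_row - 1, mid_col - 1),
            (top, mid_col, mid_row - 1, mid_col),
            (mid_row, left, mid_row, mid_col - 1),
            (mid_row, mid_col, mid_row, mid_col),
        ]
        for q_top, q_left, c_row, c_col in quadrants:
            has_hole = (q_top <= hole_row < q_top + half
                        and q_left <= hole_col < q_left + half)
            if has_hole:
                fill(q_top, q_left, half, hole_row, hole_col)
            else:
                # The central tromino covers this quadrant's corner at the
                # center, so that cell plays the role of a removed corner.
                board[c_row][c_col] = label
                fill(q_top, q_left, half, c_row, c_col)

    fill(0, 0, size, 0, 0)
    return board


if __name__ == "__main__":
    for row in tile_with_corner_removed(2):
        print(row)
```

Each call to `fill` on a square of side at least 2 places exactly one tromino, the one "in the middle", and then appeals to the smaller squares, just as the inductive hypothesis is used in the proof.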


2.2 The Principle of Complete Mathematical Induction 


There is a variant of the Principle of Mathematical Induction that is sometimes very 
useful. The basis for this variant is a slightly different characterization of the set of 
all natural numbers. 


The Principle of Complete Mathematical Induction 2.2.1. (Sometimes called 
the “Principle of Strong Mathematical Induction.”) If S is any set of natural 
numbers with the properties that 


A. 1 is in S, and
B. k + 1 is in S whenever k is a natural number and all of the natural numbers from
1 through k are in S,


then S is the set of all natural numbers. 


The informal and formal proofs of the Principle of Complete Mathematical
Induction are virtually the same as the proofs of the Principle of (ordinary)
Mathematical Induction. First consider the informal proof. If S is any set of natural
numbers with properties A and B of the Principle of Complete Mathematical
Induction, then, in particular, 1 is in S. Since 1 is in S, it follows from property B
that 2 is in S. Since 1 and 2 are in S, it follows from property B that 3 is in S.
Since 1, 2, and 3 are in S, 4 is in S, and so on. It is suggested that you write out the 
details of the formal proof of the Principle of Complete Mathematical Induction as 
a consequence of the Well-Ordering Principle. 

Just as for ordinary induction, the Principle of Complete Mathematical Induction 
can be generalized to begin at any natural number, not just 1. 


The Generalized Principle of Complete Mathematical Induction 2.2.2. Let m 
be a natural number. If S is any set of natural numbers with the properties that 


A. m is in S, and
B. k + 1 is in S whenever k is a natural number greater than or equal to m and all
of the natural numbers from m through k are in S, 


then S contains all natural numbers greater than or equal to m.


There are many situations in which it is difficult to directly apply the Principle of 
Mathematical Induction but easy to apply the Principle of Complete Mathematical 
Induction. One example of such a situation is a very precise proof of the lemma 
(Lemma 1.1.4) that was required to prove that there is no largest prime number. 


Lemma 2.2.3. Every natural number greater than 1 has a prime divisor. 


The following is a statement that clearly implies the above lemma. Note that we 
employ the convention that a single prime number is a “product of primes” where 
the product has only one factor. 


Theorem 2.2.4. Every natural number other than 1 is a product of prime numbers.


Proof. We prove this theorem using the Generalized Principle of Complete Mathe- 
matical Induction starting at 2. Let S be the set of all n that are products of primes. 
It is clear that 2 is in S, since 2 is a prime. Suppose that every natural number from 
2 up through k is in S. We must show, in order to apply the Generalized Principle of 
Complete Mathematical Induction, that k + 1 is in S.

The number k + 1 cannot be 1. We must therefore show that either it is prime or 
is a product of primes. If k + 1 is prime, we are done. If k + 1 is not prime, then 
k + 1 = xy where each of x and y is a natural number strictly between 1 and k + 1. 
Thus x and y are each at most k, so, by the inductive hypothesis, x and y are both in 
S. That is, x and y are each either primes or the product of primes. Therefore, xy can 
be written as a product of primes by writing the product of the primes comprising x 
(or x itself if x is prime) times the product of the primes comprising y (or y itself if y 
is prime). Thus, by the Generalized Principle of Complete Mathematical Induction 
starting at 2, S contains all natural numbers greater than or equal to 2. □
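
The proof of Theorem 2.2.4 translates almost word for word into a recursive procedure: if a number is not prime, write it as xy with both factors strictly smaller, and factor x and y in turn. The Python sketch below does this; the function name `prime_factors` and the search by trial division are our own choices, made for simplicity rather than speed.

```python
def prime_factors(n):
    """Return a list of primes whose product is n, for n >= 2."""
    if n < 2:
        raise ValueError("n must be a natural number greater than 1")
    # Look for a factorization n = x * y with 1 < x <= y < n.
    x = 2
    while x * x <= n:
        if n % x == 0:
            y = n // x
            # Both x and y are smaller than n, so (as in the inductive
            # hypothesis) each can be written as a product of primes.
            return prime_factors(x) + prime_factors(y)
        x += 1
    # No such factorization exists, so n itself is prime.
    return [n]


if __name__ == "__main__":
    print(prime_factors(360))
```

The recursion terminates for the same reason that complete induction applies: every recursive call is made on a strictly smaller natural number that is at least 2.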


We now describe an interesting theorem that is a little more difficult to understand.
(If you find this theorem too difficult, you need not consider it; it won’t be
used in anything that follows. You might wish to return to it at some later time.)

We begin by describing the case where n = 5. Suppose there is a pile of 5
stones. We are going to consider the sum of certain sequences of numbers obtained
as follows. Begin one such sequence by dividing the pile into two smaller piles, a
pile of 3 stones and a pile of 2 stones. Let the first term in the sum be $3 \cdot 2 = 6$.
Repeat this process with the pile of 3 stones: divide it into a pile of 2 stones and a
pile consisting of 1 stone. Add $2 \cdot 1 = 2$ to the sum. The pile with 2 stones can be
divided into 2 piles of 1 stone each. Add $1 \cdot 1 = 1$ to the sum. Now go back to the
pile of 2 stones created by the first division. That pile can be divided into 2 piles of
1 stone each. Add $1 \cdot 1 = 1$ to the sum. The total sum that we have is 10.

Let’s create another sum in a similar manner but starting a different way. Divide
the original pile of 5 stones into a pile of 4 stones and a pile of 1 stone. Begin this
sum with $4 \cdot 1 = 4$. Divide the pile of 4 stones into two piles of 2 stones each and
add $2 \cdot 2 = 4$ to the sum. The first pile of 2 stones can be divided into two piles of
1 stone each, so add $1 \cdot 1 = 1$ to the sum. Similarly, divide the second pile of 2 into
two piles of 1 each and add $1 \cdot 1 = 1$ to the sum. The sum we get proceeding in this
way is also 10.

Is it a coincidence that we got the same result, 10, for the sums we obtained in 
quite different ways? 


Theorem 2.2.5. For any natural number n greater than 1, consider a pile of n 
stones. Create a sum as follows: Divide the given pile of stones into two smaller 
piles. Let the product of the number of elements in one smaller pile and the number 
of elements in the other smaller pile be the first term in the sum. Then consider 
one of the smaller piles and (unless it consists of only one stone) divide that pile 
into two smaller piles and let the product of the number of stones in those piles 
be the second term in the sum. Do the same for the other smaller pile. Continue 
dividing, multiplying, and adding terms to the sum in all possible ways. No matter 
what sequence of divisions into subpiles is used, the total sum is n(n - 1)/2.


Proof. We prove this theorem using Generalized Complete Mathematical Induction
beginning with n = 2. Given any pile of 2 stones, there is only one way to divide
it: into two piles of 1 each. Since $1 \cdot 1 = 1$, the sum is 1 in this case. Notice that
1 = 2(2 - 1)/2, so the formula holds for the case n = 2.

Suppose now that the formula holds for all of n = 2,3,4,...,k. Consider any 
pile of k+ 1 stones. Note that k+ 1 is at least 3. We must show that for any sequence 
of divisions, the resulting sum is (k + 1)(k + 1 - 1)/2 = k(k + 1)/2.

Begin with any division of the pile into two subpiles. Call the number of stones 
in the subpiles x and y respectively. Consider first the situation where x = 1. Then 
the first term in the sum is $1 \cdot y = y$. Since x = 1 and x + y = k + 1, we know that
y = k. The process is continued by dividing the pile of y stones. By the inductive
hypothesis (since y = k, which is greater than or equal to 2), the sum obtained by
completing the process on a pile of y stones is y(y - 1)/2. Thus, the total sum for
the original pile of k + 1 stones in this case is




\[ y + \frac{y(y-1)}{2} = \frac{2y}{2} + \frac{y^2 - y}{2} = \frac{y^2 + y}{2} = \frac{y(y+1)}{2} = \frac{k(k+1)}{2} \]


If y = 1, the same proof can be given by simply interchanging the roles of x and y 
in the previous paragraph. 

The last, and most interesting, case is when neither x nor y is 1. In this case,
both x and y are greater than or equal to 2 and less than k. The first term in the 
sum is then xy. Continuing the process will give a total sum that is equal to xy 
plus the sum for the pile of x stones added to the sum for the pile of y stones. 
Therefore, using the inductive hypothesis, the sum for the original pile of k + 1 
stones is xy + x(x - 1)/2 + y(y - 1)/2. We must show that this sum is k(k + 1)/2.

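Recall that k + 1 = x + y, so x = k + 1 - y. Using this, we see that

\[
\begin{aligned}
xy + \frac{x(x-1)}{2} + \frac{y(y-1)}{2}
&= (k+1-y)y + \frac{(k+1-y)(k-y)}{2} + \frac{y(y-1)}{2} \\
&= \frac{(2ky + 2y - 2y^2) + (k^2 + k - 2ky - y + y^2) + (y^2 - y)}{2} \\
&= \frac{k^2 + k}{2} \\
&= \frac{k(k+1)}{2}
\end{aligned}
\]

This completes the proof. □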
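
Theorem 2.2.5 can also be explored experimentally by splitting piles at random and recording the resulting sums. The Python sketch below is an illustration only; the names `splitting_sum` and `observed_sums`, the use of random splits, and the number of trials are our own assumptions, and no finite experiment replaces the proof.

```python
import random


def splitting_sum(n, rng):
    """Split a pile of n stones at random until every pile has one stone.

    Returns the sum of the products recorded at each division.
    """
    if n == 1:
        return 0  # a single stone cannot be divided; nothing is added
    x = rng.randint(1, n - 1)  # size of one subpile, chosen at random
    y = n - x                  # size of the other subpile
    return x * y + splitting_sum(x, rng) + splitting_sum(y, rng)


def observed_sums(n, trials=200, seed=0):
    """Collect the distinct sums seen over many random ways of splitting."""
    rng = random.Random(seed)
    return {splitting_sum(n, rng) for _ in range(trials)}


# Every random sequence of divisions gives the same total, n(n - 1)/2.
for n in range(2, 30):
    assert observed_sums(n) == {n * (n - 1) // 2}
```

The recursive structure of `splitting_sum` is the same as that of the proof: the total for a pile is the first product xy plus the totals for the two subpiles.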


Mathematics is the most precise of subjects. However, human beings are not 
always so precise; they must be careful not to make mistakes. See if you can figure 
out what is wrong with the “proof” of the following obviously false statement. 


False Statement. All human beings are the same age. 


“Proof”. We will present what, at first glance at least, appears to be a proof of the 
above statement. We begin by reformulating it as follows: For every natural number 
n, every set of n people consists of people the same age. The assertion that “all 
human beings are the same age” would clearly follow from the case where n is the 
present population of the earth. We proceed by mathematical induction. The case 
n = 1 is certainly true; a set containing 1 person consists of people the same age.
For the inductive step, suppose that every set of k people consists of people the same 
age. Let S be any set containing k + 1 people. We must show that all the people in 
S are the same age as each other. 
List the people in S as follows: 


\[ S = \{P_1, P_2, \ldots, P_k, P_{k+1}\} \]

Consider the subset L of S consisting of the first k people in S; that is,

\[ L = \{P_1, P_2, \ldots, P_k\} \]

Similarly, let R denote the subset consisting of the last k elements of S; that is,

\[ R = \{P_2, \ldots, P_k, P_{k+1}\} \]


The sets L and R each contain k people, and so by the inductive hypothesis each 
consists of people who are the same age as each other. In particular, all the people in 
L are the same age as $P_2$. Also, all the people in R are the same age as $P_2$. But every
person in the original set S is in either L or R, so all the people in S are the same
age as $P_2$. Therefore, S consists of people the same age, and the assertion follows
by the Principle of Mathematical Induction. 


What is going on? Is it really true that all people are the same age? Not likely. Is 
the Principle of Mathematical Induction flawed? Or is there something wrong with 
the above “proof”? 

Clearly there must be something wrong with the “proof.” Please do not read 
further for at least a few minutes while you try to find the mistake. 

Wait a minute. Before you read further, please try for a little bit longer to see if 
you can find the mistake. 

If you haven’t been able to find the error yourself, perhaps a hint will help. The 
proof of the case n = 1 is surely valid; a set with one person in it contains a person
with whatever age that person is. What about the inductive step, going from k to 
k + 1? For it to be valid, it must apply for every natural number k. To conclude that 
an assertion holds for all natural numbers given that it holds for n = 1 requires that 
its truth for n = k + 1 is implied by its truth for n = k, for every natural number k.
In fact, there is a k for which the above derivation of the case n = k + 1 from the 
case n = k is not valid. Can you figure out the value of that k? 

Okay, here is the mistake. Consider the inductive step when k = 1; that is, going
from 1 to 2. In this case, the set S would have the form $S = \{P_1, P_2\}$. Then, $L =
\{P_1\}$ and $R = \{P_2\}$.

The set L does consist of people the same age as each other, as does the set R. 
But there is no person who is in both sets. Thus, we cannot conclude that everyone 
in S is the same age. This shows that the above “proof” of the inductive step does 
not hold when k = 1. In fact, the following is true. 


True Statement. If every pair of people in a given set of people consists of people 
the same age, then all the people in the set are the same age. 


Proof. Let S be the given set of people; suppose $S = \{P_1, P_2, \ldots, P_n\}$. For each i
from 2 to n, the pair $\{P_1, P_i\}$ consists of people the same age, by hypothesis. Thus,
$P_1$ and $P_i$ are the same age for every i, so every person in S is the same age as $P_1$.
Hence, everyone in S is the same age. □


2.3 Problems
Basic Exercises 


1. Prove, using induction, that for every natural number n: 


\[ 1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n \cdot (n+1) = \frac{n(n+1)(n+2)}{3} \]


2. Prove, using induction, that for every natural number n: 


\[ \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n \cdot (n+1)} = \frac{n}{n+1} \]


3. Prove, using induction, that for every natural number n: 
\[ 2 + 2^2 + 2^3 + \cdots + 2^n = 2^{n+1} - 2 \]
4. Prove, using induction, that for every natural number n:


Interesting Problems


5. Prove the following statement by induction: For every natural number n, every 
set with n elements has $2^n$ subsets. (A subset of a given set is a set all of whose
elements are elements of the given set. Note that the empty set, the set consisting 
of no elements, is a subset of every set. See Section 10.1 for more information 
about sets.) 

6. Prove, using induction, that for every natural number n: 


kee ee ee eee 
— —-oe-: — < n 


7. Prove by induction that 3 divides $n^3 + 2n$, for every natural number n.

8. Show that $3^n > n^2$ for every natural number n.

9. Use induction to prove that $2^n > n^2$, for every n > 4.

10. Show that, for every natural number n > 1 and every real number r different
from 1, 


= 


Challenging Problems


11. Prove the Principle of Complete Mathematical Induction using the Well-Ordering
Principle.

12. Prove the Well-Ordering Principle using the Principle of Complete Mathematical
Induction.

13. One version of a game called Nim is played as follows. There are two players
and two piles consisting of the same natural number of objects; for this example,
suppose the objects are nickels. At each turn, a player removes some number of
nickels from either one of the piles. Then the other player removes some number
of nickels from either of the piles. The players continue playing alternately until
the last nickel is removed. The winner is the player who removes the last nickel.
Prove: If the second player always removes the same number of nickels that
the first player last removed and does so from the other pile (thus making the
piles equal in number after the second player’s turn), then the second player will
win.

14. Define the nth Fermat number, $F_n$, by $F_n = 2^{2^n} + 1$ for n = 0, 1, 2, 3, .... The
first few Fermat numbers are $F_0 = 3$, $F_1 = 5$, $F_2 = 17$, $F_3 = 257$.

(a) Prove by induction that $F_0 \cdot F_1 \cdots F_{n-1} + 2 = F_n$, for $n \geq 1$.

(b) Use the formula in part (a) to prove that there are an infinite number of
primes, by showing that no two Fermat numbers have any prime factors in
common.

[Hint: For each $F_n$, let $p_n$ be a prime divisor of $F_n$, and show that $p_{n_1} \neq p_{n_2}$
if $n_1 \neq n_2$.]

15. The sequence of Fibonacci numbers is defined as follows: $x_1 = 1$, $x_2 = 1$, and,
for n > 2, $x_n = x_{n-1} + x_{n-2}$. Prove that

\[ x_n = \frac{1}{\sqrt{5}} \left[ \left( \frac{1+\sqrt{5}}{2} \right)^n - \left( \frac{1-\sqrt{5}}{2} \right)^n \right] \]

for every natural number n.

[Hint: Use the fact that $x = \frac{1+\sqrt{5}}{2}$ and $x = \frac{1-\sqrt{5}}{2}$ both satisfy $1 + x = x^2$.]

16. Prove the following generalization of Theorem 2.1.6:

Theorem. For each natural number n, consider a square consisting of $2^{2n}$
smaller squares (i.e., a $2^n \times 2^n$ square). If any one of the smaller squares
is removed from the large square (not necessarily from the corner), then the
resulting region can be completely covered by trominos (each made up of three
small squares of the same size as the small squares in the region) in such a way
that the trominos do not overlap.


Chapter 3
Modular Arithmetic


Consider the number obtained by adding 3 to the number consisting of 2 to the 
power 3,000,005; that is, consider the number $3 + 2^{3,000,005}$. This is a very big
number. Common calculators cannot deal with a number this big. 

When that huge number is divided by 7, what remainder is left? You can’t use 
your calculator because it can’t count that high. However, this and similar questions 
are easily answered using a kind of “calculus” of divisibility and remainders that is 
called modular arithmetic. Another application of this concept will be a proof that a 
natural number is divisible by 9 if and only if the sum of its digits is divisible by 9. 
The mathematics that we develop in this chapter has numerous other applications, 
including, for example, providing the basis for an extremely powerful method for 
sending coded messages (see Chapter 6). 


3.1 The Basics 


Recall that we say that the integer n is divisible by the integer m if there exists an 
integer q such that n = mq. In this situation, we also say that m is a divisor of n, or
m is a factor of n. 

The definition that is fundamental for modular arithmetic is the following. 


Definition 3.1.1. For any fixed natural number m greater than 1, we say that the
integer a is congruent to the integer b modulo m if a - b is divisible by m. We use
the notation $a \equiv b \pmod{m}$ to denote this relationship. The number m in this
notation is called the modulus.


Here are a few examples: 


$14 \equiv 8 \pmod{3}$, since 14 - 8 = 6 is divisible by 3
$252 \equiv 127 \pmod{5}$, since 252 - 127 = 125 is divisible by 5
$3 \equiv -11 \pmod{7}$, since 3 - (-11) = 14 is divisible by 7
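
The definition of congruence translates directly into a divisibility test. The following Python sketch checks the three examples above; the function name `is_congruent` is our own, and the test uses Python's `%` operator only to decide whether a - b is a multiple of m.

```python
def is_congruent(a, b, m):
    """True when a is congruent to b modulo m, i.e. m divides a - b."""
    if m <= 1:
        raise ValueError("the modulus must be a natural number greater than 1")
    return (a - b) % m == 0


# The three examples from the text.
assert is_congruent(14, 8, 3)      # 14 - 8 = 6 is divisible by 3
assert is_congruent(252, 127, 5)   # 252 - 127 = 125 is divisible by 5
assert is_congruent(3, -11, 7)     # 3 - (-11) = 14 is divisible by 7

# A non-example: 14 - 8 = 6 is not divisible by 4.
assert not is_congruent(14, 8, 4)
```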


Congruence shares an important property with equality.

Theorem 3.1.2. If $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$, then $a \equiv c \pmod{m}$.


Proof. The hypothesis states that a - b and b - c are both divisible by m; that is,
there are integers t and s such that a - b = tm and b - c = sm. Thus, a - c =
a - b + b - c = tm + sm = (t + s)m. In other words, a - c is divisible by m. By
definition, then, $a \equiv c \pmod{m}$. □


The theorem just proven shows that we can replace numbers in a congruence 
modulo m by any numbers congruent to them modulo m. 

Although the modulus m must be bigger than 1, there is no such restriction on the 
integers a and b; they could even be negative. In the case where a and b are positive 
integers, the relationship $a \equiv b \pmod{m}$ can be expressed in more familiar terms.


Theorem 3.1.3. When a and b are nonnegative integers, the relationship $a \equiv b
\pmod{m}$ is equivalent to a and b leaving equal remainders upon division by m.


Proof. Consider dividing m into a; if it “goes in evenly,” then m is a divisor of a and
the remainder r is 0. In any case, there are nonnegative integers q and r such that
a = qm + r; q is the quotient and r is the remainder. The nonnegative number r is
less than m, since it is the remainder. Similarly, divide b by m, getting $b = q_0 m + r_0$
for some $q_0$ and some $r_0$. This yields


\[ a - b = (qm + r) - (q_0 m + r_0) = m(q - q_0) + (r - r_0) \]


If $r = r_0$, then a - b is obviously divisible by m, so $a \equiv b \pmod{m}$. Conversely, if
r is not equal to $r_0$, note that $r - r_0$ cannot be a multiple of m. (This follows from
the fact that r and $r_0$ are both nonnegative numbers which are strictly less than m.)
Thus, a - b is a multiple of m plus a number that is not a multiple of m, and therefore
a - b is not a multiple of m. That is, it is not the case that $a \equiv b \pmod{m}$. □


A special case of the above theorem is that a positive number is congruent modulo
m to the remainder it leaves upon division by m. The possible remainders that arise
from division by a given natural number m are 0, 1, 2, ..., m - 1.


Theorem 3.1.4. For a given modulus m, each integer is congruent to exactly one
of the numbers in the set {0, 1, 2, ..., m - 1}.


Proof. Let a be an integer. If a is positive, the result follows from the fact, discussed
above, that a is congruent to the remainder it leaves upon division by m. If a is not
positive, choosing t big enough would make tm + a positive. For such a t, tm + a
is congruent to the remainder it leaves upon division by m. But also $tm + a \equiv a
\pmod{m}$. It follows from Theorem 3.1.2 that a is congruent to the remainder that
tm + a leaves upon division by m. An integer cannot be congruent to two different
numbers in the given set {0, 1, 2, ..., m - 1}, since no two numbers in the set are
congruent to each other (by Theorem 3.1.3). □
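
Theorem 3.1.4 can be checked by brute force for small moduli, including for negative integers. In the Python sketch below, the function name `residues_congruent_to` is our own. The sketch also relies on a fact about Python rather than about the text: for a positive modulus m, the expression `a % m` always lies in the set {0, 1, ..., m - 1}, even when a is negative.

```python
def residues_congruent_to(a, m):
    """List every r in {0, 1, ..., m - 1} such that m divides a - r."""
    return [r for r in range(m) if (a - r) % m == 0]


# Each integer, positive or not, is congruent to exactly one number in
# {0, 1, ..., m - 1}, and Python's % operator finds that number.
for m in range(2, 12):
    for a in range(-60, 61):
        assert residues_congruent_to(a, m) == [a % m]

# For instance, -11 is congruent to 3 modulo 7, since -11 - 3 = -14.
assert residues_congruent_to(-11, 7) == [3]
```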


For a fixed modulus, congruences have some properties that are similar to those 
for equations. 


Theorem 3.1.5. If a ≡ b (mod m) and c ≡ d (mod m), then 


(i) (a + c) ≡ (b + d) (mod m), and 
(ii) ac ≡ bd (mod m). 


Proof. To prove (i), note that a ≡ b (mod m) means that a - b = sm for some 
integer s. Similarly, c - d = tm for some integer t. The conclusion we are trying to 
establish is equivalent to the assertion that (a + c) - (b + d) is a multiple of m. But 
(a + c) - (b + d) = (a - b) + (c - d), which is equal to sm + tm = (s + t)m, so 
the result follows. 

To prove (ii), note that from a - b = sm and c - d = tm, we get a = b + sm 
and c = d + tm, so 


ac = (b + sm)(d + tm) = bd + btm + smd + stm^2 


It follows that ac - bd = m(bt + sd + stm), so ac - bd is a multiple of m and the 
result is established. □ 


Theorem 3.1.5 tells us that congruences are similar to equations in that you can 
add congruent numbers to both sides of a congruence or multiply both sides of a 
congruence by congruent numbers and preserve the congruence, as long as all the 
congruences are with respect to the same fixed modulus. 

For example, since 3 ≡ 28 (mod 5) and 17 ≡ 2 (mod 5), it follows that 20 ≡ 
30 (mod 5) and 51 ≡ 56 (mod 5). 

Here is another example: 8 ≡ 1 (mod 7), so 8^2 ≡ 1^2 (mod 7), or 8^2 ≡ 1 
(mod 7). It follows that 8^2 · 8 ≡ 1 · 1 (mod 7), or 8^3 ≡ 1 (mod 7). In fact, all 
positive integer powers of 8 are congruent to 1 modulo 7. This is a special case of 
the next result. 
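
The two numerical examples above can be verified mechanically. The short Python sketch below is an illustration added for the reader, not part of the original text; the helper name `congruent` is hypothetical and simply tests whether m divides x - y.

```python
def congruent(x: int, y: int, m: int) -> bool:
    """Return True when x ≡ y (mod m), i.e. when m divides x - y."""
    return (x - y) % m == 0


m = 5
a, b = 3, 28    # 3 ≡ 28 (mod 5)
c, d = 17, 2    # 17 ≡ 2 (mod 5)
assert congruent(a, b, m) and congruent(c, d, m)

# Part (i) of Theorem 3.1.5: the sums 20 and 30 are congruent modulo 5.
sums_agree = congruent(a + c, b + d, m)
# Part (ii) of Theorem 3.1.5: the products 51 and 56 are congruent modulo 5.
products_agree = congruent(a * c, b * d, m)
print(sums_agree, products_agree)

# The second example: the first few powers of 8 are congruent to 1 modulo 7.
powers_of_8_agree = all(congruent(8 ** n, 1, 7) for n in range(1, 20))
print(powers_of_8_agree)
```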


Theorem 3.1.6. If a ≡ b (mod m), then a^n ≡ b^n (mod m), for every natural 
number n. 


Proof. We use the Principle of Mathematical Induction. The case n = 1 is the 
hypothesis. Assume that the result is true for n = k; that is, a^k ≡ b^k (mod m). 
Since a ≡ b (mod m), part (ii) of Theorem 3.1.5 gives a · a^k ≡ b · b^k (mod m), or 
a^{k+1} ≡ b^{k+1} (mod m). Thus, the statement is true for all n ≥ 1 by induction. □ 


3.2 Some Applications 


We can use the above to easily solve the problem that we mentioned at the beginning 
of this chapter: What is the remainder left when 3 + 2^{3,000,005} is divided by 7? 
First note that 2^3 = 8 is congruent to 1 modulo 7. Therefore, by Theorem 3.1.6, 
(2^3)^{1,000,000} is congruent to 1^{1,000,000} modulo 7. But 1^{1,000,000} = 1. Thus 
2^{3,000,000} ≡ 1 (mod 7). Since 2^5 ≡ 4 (mod 7) and 2^{3,000,005} = 2^{3,000,000} · 2^5, 
it follows that 2^{3,000,005} ≡ 4 (mod 7). Thus, 3 + 2^{3,000,005} ≡ 3 + 4 ≡ 0 (mod 7). 
Therefore, 7 is a divisor of 3 + 2^{3,000,005}. In other words, the remainder that is left 
when 3 + 2^{3,000,005} is divided by 7 is 0. 
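
For readers who want to confirm this with a computer, here is a minimal Python sketch (an added illustration, not part of the original text). It uses Python's built-in three-argument `pow`, which computes a power modulo a number without ever forming the enormous power itself. The variable names are chosen only for this example.

```python
exponent = 3_000_005

# Direct computation: pow(base, exp, mod) reduces modulo 7 as it goes.
power_mod_7 = pow(2, exponent, 7)

# The argument in the text: 2^3 ≡ 1 (mod 7), so only the remainder of the
# exponent upon division by 3 matters.
shortcut = pow(2, exponent % 3, 7)

remainder = (3 + power_mod_7) % 7
print(power_mod_7, shortcut, remainder)
```

Both computations should agree with the value 4 found above, and the final remainder should be 0.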

Let’s look at the next question we mentioned at the beginning of this chapter, the 
relationship between divisibility by 9 of a number and divisibility by 9 of the sum of 
the digits of the number. To illustrate, we begin with a particular example. Consider 
the number 73,486. What that really means is 


7 · 10^4 + 3 · 10^3 + 4 · 10^2 + 8 · 10 + 6 


Note that 10 is congruent to 1 modulo 9, so 10^n is congruent to 1 modulo 9 for every 
natural number n. Thus, a · 10^n ≡ a (mod 9) for every a and every n. It follows that 
7 · 10^4 + 3 · 10^3 + 4 · 10^2 + 8 · 10 + 6 is congruent to (7 + 3 + 4 + 8 + 6) modulo 9. Thus, 
the number 73,486 and the sum of its digits are congruent to each other modulo 9 
and therefore leave the same remainders upon division by 9. The general theorem is 
the following. 


Theorem 3.2.1. Every natural number is congruent to the sum of its digits mod- 
ulo 9. In particular, a natural number is divisible by 9 if and only if the sum of its 
digits is divisible by 9. 


Proof. If n is a natural number, then we can write it in terms of its digits in the form 
a_k a_{k-1} a_{k-2} ... a_1 a_0 (note that this is a listing of digits, not a product of digits), 
where each a_i is one of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 (with a_k ≠ 0). That is, a_0 is the 
digit in the “1’s place,” a_1 is the digit in the “10’s place,” a_2 is the digit in the “100’s 
place,” and so on. (In the previous example, n was the number 73,486, so in that 
case a_4 = 7, a_3 = 3, a_2 = 4, a_1 = 8, and a_0 = 6.) This means that 


n = a_k · 10^k + a_{k-1} · 10^{k-1} + ... + a_1 · 10^1 + a_0 · 10^0 


As shown above, 10 ≡ 1 (mod 9) implies 10^i ≡ 1 (mod 9), for every positive 
integer i. Therefore, n is congruent to (a_k + a_{k-1} + a_{k-2} + ... + a_1 + a_0) modulo 9. 
Thus, n and the sum of its digits leave the same remainders upon division by 9. In 
particular, n is divisible by 9 if and only if the sum of its digits is divisible by 9. □ 
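
Theorem 3.2.1 is easy to test on examples. The following Python sketch is an added illustration; the helper name `digit_sum` is hypothetical. It compares the remainder of n upon division by 9 with the remainder of the sum of the digits of n.

```python
def digit_sum(n: int) -> int:
    """Sum of the decimal digits of a natural number n."""
    return sum(int(ch) for ch in str(n))


for n in (73486, 9, 123456789, 1000):
    # n and the sum of its digits leave the same remainder modulo 9.
    assert n % 9 == digit_sum(n) % 9
    print(n, digit_sum(n), n % 9)
```

Checking finitely many numbers is of course not a proof; the proof above is what guarantees the result for every natural number.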


Congruences with unknowns can easily be solved by just trying all possibilities 
if the modulus is small. 


Example 3.2.2. Find a solution to the congruence 5x ≡ 11 (mod 19). 


Solution. If there is a solution, then, by Theorem 3.1.4, there is a solution within 
the set {0, 1, 2, ..., 18}. If x = 0, then 5x = 0, so 0 is not a solution. Similarly, for 
x = 1, 5x = 5; for x = 2, 5x = 10; for x = 3, 5x = 15; for x = 4, 5x = 20; and for 
x = 5, 5x = 25. None of these is congruent to 11 (mod 19), so we have not yet found 
a solution. However, when x = 6, 5x = 30, which is congruent to 11 (mod 19). Thus, 
x ≡ 6 (mod 19) is a solution of the congruence. □ 


Example 3.2.3. Show that there is no solution to the congruence x^2 ≡ 3 (mod 5). 


Proof. If x = 0, then x^2 = 0; if x = 1, then x^2 = 1; if x = 2, then x^2 = 4; if 
x = 3, then x^2 = 9, which is congruent to 4 (mod 5); and if x = 4, then x^2 = 16, 
which is congruent to 1 (mod 5). If there were any solution, it would be congruent 
to one of {0, 1, 2, 3, 4} by Theorem 3.1.4. Thus, the congruence has no solution. □ 
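
The "try all possibilities" method used in the two examples above can be sketched in Python as follows. This is an added illustration, not part of the original text; the function name `solve_congruence` and its parameters are hypothetical. By Theorem 3.1.4 it is enough to test x = 0, 1, ..., m - 1.

```python
def solve_congruence(f, target: int, m: int) -> list:
    """Return every x in {0, ..., m - 1} with f(x) ≡ target (mod m)."""
    return [x for x in range(m) if (f(x) - target) % m == 0]


# Example 3.2.2: 5x ≡ 11 (mod 19)
linear_solutions = solve_congruence(lambda x: 5 * x, 11, 19)

# Example 3.2.3: x^2 ≡ 3 (mod 5)
square_solutions = solve_congruence(lambda x: x * x, 3, 5)

print(linear_solutions, square_solutions)
```

An empty list means that the congruence has no solution at all. This approach is only practical when the modulus is small.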


3.3 Problems 


Basic Exercises 


1. Find a solution x to each of the following congruences. (“Solution” means 
integer solution.) 


(a) 2x ≡ 7 (mod 11) 
(b) 7x ≡ 4 (mod 11) 
(c) x° =3 (mod 4) 


2. For each of the following congruences, either find a solution or prove that no 
solution exists. 


(a) 39x ≡ 13 (mod 5) 

(b) 95x ≡ 13 (mod 5) 

(c) x^2 ≡ 3 (mod 6) 

(d) 5x? = 12 (mod 8) 

(e) 4x^3 + 2x ≡ 7 (mod 5) 


Interesting Problems 


3. Find the remainder when: 


(a) 37463 is divided by 8 

(b) 29 is divided by 15 

(c) 24319! is divided by 8 

(d) 5700! + (27)! is divided by 8 

(e) (—8)*174 + 6310! + 7° is divided by 3 
(f) 710 + 64° is divided by 3 

(g) 5!- 181 — 866 - 332 is divided by 6 


4, Is 2°°8 + 3 divisible by 15? 

5. Find a digit b such that the number 279452 is divisible by 8. 

6. Determine whether or not 177497 + 25376 + 5782 is divisible by 3. 

7. Suppose that 77 is written out in the ordinary way. What is its last digit? 


8. Determine whether or not the following congruence has a natural number 
solution: 


5^x + 3 ≡ 5 (mod 100) 


9. Prove that n^2 - 1 is divisible by 8, for every odd integer n. 

10. Prove that a natural number is divisible by 3 if and only if the sum of its digits 
is divisible by 3. 


11. Prove that x^5 ≡ x (mod 10), for every integer x. (This shows that x^5 and x 
have the same units’ digit for every integer x.) 

12. Suppose a number is written as abba, where a and b are any integers from 1 to 
9. Prove that this number is divisible by 11. 

13. Find the units’ digit of 27493°7°?. 

14. Show that if m is a natural number and a is a negative integer, then there exists 
an r with 0 ≤ r ≤ m - 1 and an integer q such that a = qm + r. (See the proof 
of Theorem 3.1.3.) 

15. Prove that, for every pair of natural numbers m and n, m^2 is congruent to n^2 
modulo (m + n). 

Challenging Problems 

16. Prove that 5 divides 3^{2n+1} + 2^{2n+1}, for every natural number n. 

17. Prove that 7 divides 8^{2n+1} + 6^{2n+1}, for every natural number n. 

18. Prove that a natural number that is congruent to 2 modulo 3 has a prime factor 
that is congruent to 2 modulo 3. 

19. If m is a natural number greater than 1 and is not prime, then we know that 
m = ab, where 1 < a < m and 1 < b < m. Show that there is no integer x 
such that ax ≡ 1 (mod m). (That is, a has no multiplicative inverse modulo m. 
The situation is different if m is prime: see Problem 7 in Chapter 4.) 

20. Prove that 133 divides 11^{n+1} + 12^{2n-1}, for every natural number n. 

21. A natural number r less than or equal to m - 1 is called a quadratic residue 
modulo m if there is an integer x such that x^2 ≡ r (mod m). Determine all the 
quadratic residues modulo 11. 

22. Show that there do not exist natural numbers x and y such that x^2 + y^2 = 4003. 
[Hint: Begin by determining which of the numbers {0, 1, 2, 3} can be congruent 
to x^2 (mod 4).] 

23. Discover and prove a theorem determining whether a natural number is divisible 
by 11, in terms of its digits. 

24. Prove that there are an infinite number of primes of the form 4k + 3 with k a 
natural number. 
[Hint: If p_1, p_2, ..., p_n are n such primes, show that (4 · p_1 · p_2 ··· p_n) - 1 has 
at least one prime divisor of the given form.] 

25. Prove that there are an infinite number of primes of the form 6k + 5 with k a 
natural number. 

26. Prove that every prime number greater than 3 differs by 1 from a multiple of 6. 

27. Show that, if x, y, and z are integers such that x^2 + y^2 = z^2, then at least one of 
{x, y, z} is divisible by 2, at least one of {x, y, z} is divisible by 3, and at least 
one of {x, y, z} is divisible by 5. 

28. Let p(x) be a non-constant polynomial with integer coefficients. (That is, there 
exists a natural number n and integers a_i such that p(x) = a_n x^n + a_{n-1} x^{n-1} + 
... + a_1 x + a_0 and a_n ≠ 0.) Let a, k, and m be integers with m > 1. Suppose 
that p(a) ≡ k (mod m). Prove that p(a + m) ≡ k (mod m). 

29. Show that the polynomial p(x) = x^2 - x + 41 takes prime values for every x 
in the set {0, 1, 2, ..., 40}. 

30. Show that there does not exist any non-constant polynomial p(x) with integer 
coefficients such that p(x) is a prime number for all natural numbers x. 


Chapter 4 
The Fundamental Theorem of Arithmetic 


Is 13^{217} · 37^{92} · 41^{15} = 19^{111} · 29^{145} · 43^{12} · 47^{59}? 

We have seen that every natural number greater than 1 is either a prime or a 
product of primes. The above equation, if it were a valid equation, would express a 
number in two different ways as a product of primes. Does the representation of 
a natural number as a product of primes have to be unique? The answer is obviously 
“no” in one sense. For example, 6 = 3-2 = 2-3. Thus, the same number can 
be written in two different ways as a product of primes if we consider different 
orders as “different ways.” But suppose that we don’t consider the ordering; must 
the factorization of a natural number into a product of primes be unique except for 
the order? For example, could the above equation hold? 

In fact, every natural number other than 1 has a factorization into a product 
of primes and the factorization is unique except for the order. This result is so 
important that it is called the Fundamental Theorem of Arithmetic. We will give 
two proofs. The second proof requires a little more development and will be given 
later (Theorem 7.2.4). The first proof is short but tricky. 


4.1 Proof of the Fundamental Theorem of Arithmetic 


In order to simplify the statement of the Fundamental Theorem of Arithmetic, we 
use the expression “‘a product of primes” to include the case of a single prime 
number (as we did in Theorem 2.2.4). 


The Fundamental Theorem of Arithmetic 4.1.1. Every natural number greater 
than 1 can be written as a product of primes, and the expression of a number as 
a product of primes is unique except for the order of the factors. 


Proof. We have already established that every natural number greater than 1 can 
be written as a product of primes (see Theorem 2.2.4). That is the easy part of the 
Fundamental Theorem of Arithmetic; the harder part is the uniqueness. The proof 
of uniqueness that we present below is a proof by contradiction. That is, we will 
assume that there is a natural number with more than one representation as a product 
of primes and derive a contradiction from this assumption, thereby showing that this 
assumption is incorrect. 

Suppose, then, that there is at least one natural number greater than 1 with at 
least two different representations as a product of primes. By the Well-Ordering 
Principle (2.1.2), there would then be a smallest natural number with that property 
(i.e., a smallest natural number that has at least two different such representations). 
Let N be that smallest such number. Write out two different factorizations of N: 


N = p_1 p_2 ··· p_r = q_1 q_2 ··· q_s 


where each of the p_i and the q_j are primes (there can be repetitions of the same 
prime). Notice that N cannot be prime, since there is only one way to express a 
prime, so r and s are both bigger than 1. 

We first claim that no p_i could be equal to any q_j. This follows from the fact 
that N is the smallest number with a non-unique representation, for if p_i = q_j for 
some i and j, that common factor could be divided from both of the two different 
factorizations for N, producing a smaller number that has at least two different 
factorizations. Thus, no p_i is equal to any q_j. 

Since p_1 is different from q_1, one of p_1 and q_1 is less than the other; suppose that 
p_1 is less than q_1. (If q_1 is less than p_1, the same proof could be repeated by simply 
interchanging the p’s and q’s.) Define M by 


M = N - (p_1 q_2 ··· q_s) 


Then M is a natural number that is less than N. Substituting the product p_1 p_2 ··· p_r 
for N gives 


M = (p_1 p_2 ··· p_r) - (p_1 q_2 ··· q_s) = p_1 [(p_2 ··· p_r) - (q_2 ··· q_s)] 


from which it follows that p_1 divides M. In particular, M is not 1. Since M is less 
than N, M has a unique factorization into primes. 

Substituting the product q_1 q_2 ··· q_s for N in the definition of M gives a different 
expression: 


M = (q_1 q_2 ··· q_s) - (p_1 q_2 ··· q_s) = (q_1 - p_1)(q_2 ··· q_s) 


The unique factorization of M into primes can thus be obtained by writing the 
unique factorization of q_1 - p_1 followed by the product q_2 ··· q_s. On the other hand, 
the fact that p_1 is a divisor of M implies that p_1 must appear in the factorization of 
M into primes. Since p_1 is distinct from each of {q_2, ..., q_s}, it follows that p_1 must 
occur in the factorization of q_1 - p_1 into primes. Thus, q_1 - p_1 = p_1 k, for some 
natural number k. It follows that q_1 = p_1 + p_1 k = p_1 (1 + k), which shows that q_1 
is divisible by p_1. Since p_1 and q_1 are distinct primes, this is impossible. Hence, the 
assumption that there is a natural number with two distinct factorizations leads to a 
contradiction, so factorizations into primes are unique. □ 


The Fundamental Theorem of Arithmetic gives a so-called “canonical form” for 
expressing each natural number greater than 1. 


Corollary 4.1.2. Every natural number N greater than 1 has a canonical factorization 
into primes; that is, each natural number N greater than 1 has a unique 
representation of the form N = p_1^{a_1} p_2^{a_2} ··· p_n^{a_n}, where each p_i is a prime, p_i is less 
than p_{i+1} for each i, and each a_i is a natural number. 


Proof. To see this, simply factor the given number as a product of primes and then 
collect all occurrences of the smallest prime together, then all the occurrences of the 
next smallest prime, and so on. □ 


For example, the canonical form of 60,368 is 2^4 · 7^3 · 11. The canonical form of 
19 is simply 19. 
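
For small numbers, the canonical factorization can be found by repeatedly dividing out the smallest remaining prime factor. The Python sketch below is an added illustration, not a method taken from the original text; the function name `canonical_factorization` is hypothetical, and simple trial division is used purely for clarity (it is far too slow for very large numbers).

```python
def canonical_factorization(n: int) -> list:
    """Return [(p_1, a_1), ..., (p_k, a_k)] with p_1 < p_2 < ... < p_k prime
    and n equal to the product of the powers p_i ** a_i."""
    if n < 2:
        raise ValueError("n must be a natural number greater than 1")
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            exponent = 0
            # Collect all occurrences of the divisor p together.
            while n % p == 0:
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors.append((n, 1))
    return factors


print(canonical_factorization(60368))
print(canonical_factorization(19))
```

Only primes can end up in the list: by the time a composite candidate p is reached, all of its prime factors have already been divided out of n.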

As we will see, the following corollary of the Fundamental Theorem of Arith- 
metic is very useful. (If the corollary below is independently established, then it 
is easy to derive the Fundamental Theorem of Arithmetic from it. In fact, most 
presentations of the proof of the Fundamental Theorem of Arithmetic use this 
approach rather than the shorter but trickier proof that we gave above. We will 
present such a proof later (Theorem 7.2.4).) 


Corollary 4.1.3. If p is a prime number and a and b are natural numbers such that 
p divides ab, then p divides at least one of a and b. (That is, if a prime divides a 
product, then it divides at least one of the factors.) 


Proof. Since p divides ab, there is some natural number d such that ab = pd. The 
unique factorization of ab into primes therefore contains the prime p and all the 
primes that divide d. On the other hand, a and b each have unique factorizations 
into primes. Let the canonical factorization of a be q_1^{α_1} q_2^{α_2} ··· q_m^{α_m} and of b be 
r_1^{β_1} r_2^{β_2} ··· r_n^{β_n}. Then, 


ab = q_1^{α_1} q_2^{α_2} ··· q_m^{α_m} r_1^{β_1} r_2^{β_2} ··· r_n^{β_n} 


Since the factorization of ab into primes is unique, p must occur either as one of 
the q_i’s, in which case p divides a, or as one of the r_j’s, in which case p divides b. 
Thus, p divides at least one of a and b, and the corollary is established. □ 


It should be noted that this corollary does not generally hold for divisors that are 
not prime. For example, 18 divides 3 - 12, but 18 does not divide 3 and 18 does not 
divide 12. 


4.2 Problems 


Basic Exercises 


1. Find the canonical factorization into primes of each of the following: 


(a) 52 
(b) 72 
(c) 47 
(d) 625 
(e) 122.54 
(f) 112 
(g) 224 
(h) 112 + 224 


2. Find natural numbers x, y, and z such that 


(a) 3*- 100-57 =9- 10% -5 
(b) 50-2”. 7% = 5*-23. 14 


3. Show that if p is a prime number and a_1, a_2, ..., a_n are natural numbers such 
that p divides the product a_1 a_2 ··· a_n, then p divides a_i for at least one i. 

4. Show that if p is a prime number and a and n are natural numbers such that p 
divides a^n, then p divides a. 


Interesting Problems 


5. Find the smallest natural numbers x and y such that 


(a) 7?x = 53y 
(b) 2x = 107y 
(c) 127x = 54y 


6. Find nonnegative integers w, x, y, and z such that 


17°2572% = 10*34"7” 


Challenging Problems 


7. Suppose that p is a prime number and p does not divide a. Prove that 
the congruence ax ≡ 1 (mod p) has a solution. (This proves that a has a 
multiplicative inverse modulo p.) 

8. Prove that a natural number m greater than 1 is prime if m has the property that 
it divides at least one of a and b whenever it divides ab. 

9. Prove that, for every prime number p, x^2 ≡ 1 (mod p) implies x ≡ 1 (mod p) 
or x ≡ (p - 1) (mod p). 

10. Suppose that a and b are natural numbers whose prime factorizations have 
no primes in common (the pair a, b is then said to be relatively prime; see 
Definition 7.2.1). Show that, for any natural number m, the product ab divides 
m if each of a and b divides m. 

11. Using the result of Problem 10: 


(a) Prove that 42 divides n^7 - n, for every natural number n. 
(b) Prove that 21 divides 3n^7 + 7n^3 + 11n, for every natural number n. 


Chapter 5 
Fermat’s Little Theorem and Wilson’s Theorem 


We’ve seen that we can add or multiply “both sides” of a congruence by congruent 
numbers and the result will be a congruence (Theorem 3.1.5). What about dividing 
both sides of a congruence by the same natural number? For the result to have a 
chance of being a congruence, the divisor must divide evenly into both sides of the 
congruence so that the result involves only integers, not fractions (congruences are 
only defined for integers). However, even that condition is not sufficient to ensure 
that the result will be a congruence. For example, 6 - 2 is congruent to 6 - 1 modulo 
3, but 2 is not congruent to 1 modulo 3. This is not a surprising example, since 6 
is congruent to 0 modulo 3, so “dividing both sides” of the above congruence by 
6 is like dividing by 0, which gives wrong results for equations as well. However, 
there are also examples where dividing both sides of a congruence by a number that 
is not congruent to 0 leads to results that are not congruent. For example, 12 - 3 is 
congruent to 24 - 3 modulo 9, but 12 is not congruent to 24 modulo 9, in spite of the 
fact that 3 is not congruent to 0 modulo 9. 

However, there are important cases in which we can divide both sides of a 
congruence and be assured that the result is a congruence. Analyzing these cases 
produces some very interesting theorems. 


5.1 Fermat’s Little Theorem 


Theorem 5.1.1. If p is a prime and a is not divisible by p, and if ab ≡ ac (mod p), 
then b ≡ c (mod p). (That is, we can divide both sides of a congruence modulo a 
prime by any natural number that divides both sides of the congruence and is not 
divisible by the prime.) 


Proof. We are given that p divides ab - ac. This is the same as saying that p divides 
a(b - c). Corollary 4.1.3 shows that, since p divides a(b - c), p must also divide 
either a or b - c. Since the hypothesis states that a is not divisible by p, this implies 
that b - c must be divisible by p. That is the same as saying b ≡ c (mod p). □ 
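
The contrast between prime and composite moduli can be explored by brute force. The Python sketch below is an added illustration, not part of the original text; the helper names `congruent` and `cancellation_always_works` are hypothetical. It first reproduces the failed cancellation modulo 9 from the start of this chapter and then tests, for each small modulus m, whether ab ≡ ac (mod m) forces b ≡ c (mod m) for every a that is not divisible by m.

```python
def congruent(x: int, y: int, m: int) -> bool:
    """Return True when m divides x - y."""
    return (x - y) % m == 0


def cancellation_always_works(m: int) -> bool:
    """Check the conclusion of Theorem 5.1.1 for all residues modulo m."""
    for a in range(1, m):              # residues that are not divisible by m
        for b in range(m):
            for c in range(m):
                if congruent(a * b, a * c, m) and not congruent(b, c, m):
                    return False
    return True


# 12 · 3 ≡ 24 · 3 (mod 9), but 12 is not congruent to 24 modulo 9.
fails_mod_9 = congruent(12 * 3, 24 * 3, 9) and not congruent(12, 24, 9)

good_moduli = [m for m in range(2, 13) if cancellation_always_works(m)]
print(fails_mod_9, good_moduli)
```

In this range the moduli that pass the test are exactly the primes, in agreement with Theorem 5.1.1.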


Consider any given prime number p. The possible remainders when a natural 
number is divided by p are the numbers {0, 1,..., p — 1}. By Theorem 3.1.4, no 
two of these numbers are congruent to each other and every natural number (in fact, 
every integer) is congruent modulo p to one of those numbers. An integer is divisible 
by p if and only if it is congruent to 0 modulo p. Thus, each integer that is not 
divisible by p is congruent to exactly one of the numbers in the set {1, 2, ..., p - 1}. 
This is the basis for the proof of the following beautiful, and very useful, theorem. 


Fermat’s Little Theorem 5.1.2. If p is a prime number and a is any natural 
number that is not divisible by p, then a^{p-1} ≡ 1 (mod p). 


Proof. Let p be any prime number and let a be any natural number that is not 
divisible by p. Consider the set of numbers {a · 1, a · 2, ..., a · (p - 1)}. First 
note that no two of those numbers are congruent to each other, for if as ≡ at 
(mod p), then, by Theorem 5.1.1, s ≡ t (mod p). Since no two of the numbers 
in the set {1, 2, ..., p - 1} are congruent to each other, this shows that the same 
is true of the numbers in the set {a · 1, a · 2, ..., a · (p - 1)}. Also note that each 
of the numbers in the set {a · 1, a · 2, ..., a · (p - 1)} is congruent to one of the 
numbers in {1, 2, ..., p - 1} since no number in either set is divisible by p. Thus, 
the numbers in the set {a · 1, a · 2, ..., a · (p - 1)} are congruent, in some order, 
to the numbers in the set {1, 2, ..., p - 1}. This implies that the product of all of 
the numbers in the set {a · 1, a · 2, ..., a · (p - 1)} is congruent modulo p to the 
product of all the numbers in {1, 2, ..., p - 1}. Thus, (a · 1) · (a · 2) ··· (a · (p - 1)) is 
congruent to 1 · 2 · 3 ··· (p - 1) modulo p. Since the number a occurs p - 1 times 
in this congruence, this yields a^{p-1} · 1 · 2 · 3 ··· (p - 1) ≡ 1 · 2 · 3 ··· (p - 1) 
(mod p). Clearly, p does not divide 1 · 2 · 3 ··· (p - 1) (by repeated application of 
Corollary 4.1.3). Thus, by Theorem 5.1.1, we can “divide” both sides of the above 
congruence by 1 · 2 · 3 ··· (p - 1), yielding a^{p-1} ≡ 1 (mod p). □ 


As we shall see, Fermat’s Little Theorem has important applications, including 
in establishing a method for sending coded messages. It is also sometimes useful to 
apply Fermat’s Little Theorem to specific cases. For example, 88^{100} - 1 is divisible 
by 101. (Don’t try to verify this on your calculator!) 
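
Although a pocket calculator cannot handle 88^{100}, a short program can. The Python sketch below is an added illustration and not part of the original text. It checks the example with the built-in three-argument `pow`, and it also checks the key step of the proof for p = 101 and a = 88, namely that the numbers a · 1, a · 2, ..., a · (p - 1) are congruent, in some order, to 1, 2, ..., p - 1. The variable names are chosen only for this example.

```python
p, a = 101, 88

# Fermat's Little Theorem for this case: 88^100 ≡ 1 (mod 101).
fermat_holds = pow(a, p - 1, p) == 1

# Key step of the proof: reducing a·1, a·2, ..., a·(p - 1) modulo p
# gives the numbers 1, 2, ..., p - 1 in some order.
multiples = sorted((a * k) % p for k in range(1, p))
is_rearrangement = multiples == list(range(1, p))

print(fermat_holds, is_rearrangement)
```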

The following corollary of Fermat’s Little Theorem is sometimes useful since it 
doesn’t require that a not be divisible by p. 


Corollary 5.1.3. If p is a prime number, then a^p ≡ a (mod p) for every natural 
number a. 


Proof. If p does not divide a, then Fermat’s Little Theorem states that a^{p-1} ≡ 1 
(mod p). Multiplying both sides of this congruence by a gives the result in this 
case. On the other hand, if p does divide a, then p also divides a^p, so a^p and a are 
both congruent to 0 mod p. □ 


Definition 5.1.4. A multiplicative inverse modulo p for a natural number a is a 
natural number b such that ab ≡ 1 (mod p). 


Of course, if b is a multiplicative inverse of a modulo p, then so is any natural 
number that is congruent to b modulo p. 

Fermat’s Little Theorem provides one way of showing that all natural numbers 
that are not multiples of a given prime p have multiplicative inverses modulo p. 


Corollary 5.1.5. If p is a prime and a is a natural number that is not divisible by p, 
then there exists a natural number x such that ax ≡ 1 (mod p). 


Proof. In the case where p = 2, each such a must be congruent to 1 modulo 2, so 
we can take x = 1. If p is greater than 2, then, for each given a, let x = a^{p-2}. Then 
ax = a · a^{p-2} = a^{p-1} and, by Fermat’s Little Theorem, a^{p-1} ≡ 1 (mod p). □ 
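
The proof of Corollary 5.1.5 is constructive, so it can be turned into a computation. The Python sketch below is an added illustration; the function name `inverse_mod_prime` is hypothetical, and the code assumes (without checking) that p really is prime. As a usage example, it re-solves the congruence 5x ≡ 11 (mod 19) from Example 3.2.2 by multiplying both sides by an inverse of 5.

```python
def inverse_mod_prime(a: int, p: int) -> int:
    """Return x in {1, ..., p - 1} with a·x ≡ 1 (mod p).

    Assumes p is prime (this is not verified) and a is not divisible by p.
    """
    if a % p == 0:
        raise ValueError("a must not be divisible by p")
    # x = a^(p-2), reduced modulo p, as in the proof of Corollary 5.1.5.
    return pow(a, p - 2, p)


inv5 = inverse_mod_prime(5, 19)
assert (5 * inv5) % 19 == 1

# Multiplying 5x ≡ 11 (mod 19) by the inverse of 5 gives x ≡ inv5 · 11.
x = (inv5 * 11) % 19
print(inv5, x)
```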


It turns out to be interesting and useful to know which natural numbers are 
congruent to their own inverses modulo p. If x is such a number, then x · x ≡ 1 
(mod p). In other words, such an x is a solution to the congruence x^2 ≡ 1 (mod p), 
or x^2 - 1 ≡ 0 (mod p). The solutions of the equation x^2 - 1 = 0 are x = 1 and 
x = -1. The solutions of the congruence are similar. 


Theorem 5.1.6. If p is a prime number and x is an integer satisfying x^2 ≡ 1 
(mod p), then either x ≡ 1 (mod p) or x ≡ p - 1 (mod p). (Note that p - 1 ≡ -1 
(mod p).) 


Proof. If x^2 ≡ 1 (mod p), then, by definition, p divides x^2 - 1. But x^2 - 1 = 
(x - 1)(x + 1). Since p divides x^2 - 1, Corollary 4.1.3 implies that p divides at 
least one of x - 1 and x + 1. If p divides x - 1, then x ≡ 1 (mod p). If p divides 
x + 1, then x ≡ -1 (mod p), or x ≡ p - 1 (mod p). □ 


The following lemma is needed in the proof of Wilson’s Theorem (5.2.1). 


Lemma 5.1.7. If a and c have a common multiplicative inverse modulo p, then a 
is congruent to c modulo p. 


Proof. Suppose ab ≡ 1 (mod p) and cb ≡ 1 (mod p). Then multiplying the 
second congruence on the right by a yields cba ≡ a (mod p) and, since ba ≡ 1 
(mod p), this gives c ≡ a (mod p). □ 


5.2 Wilson’s Theorem 


As we now show, these considerations lead to a proof of Wilson’s Theorem, a 
theorem that is very beautiful, although it is considerably less famous and much 
less useful than Fermat’s Little Theorem. 


Wilson’s Theorem 5.2.1. If p is a prime number, then (p - 1)! + 1 ≡ 0 (mod p). 
(In other words, if p is prime, then p divides (p - 1)! + 1.) 


Proof. First note that the theorem is obviously true when p = 2; in this case, it states 
(1 + 1) ≡ 0 (mod 2). In the following, we assume that p is a prime greater than 2. 

As we indicated above, a multiplicative inverse of an integer x modulo p is 
an integer y such that xy ≡ 1 (mod p). As we have seen, no two numbers in the 
set {1, 2, ..., p - 1} are congruent modulo p, and, by Corollary 5.1.5, each has a 
multiplicative inverse modulo p. Since no multiplicative inverse can be divisible 
by p, the multiplicative inverse of each number in {1, 2, ..., p - 1} is congruent 
to one of the numbers in {1, 2, ..., p - 1}. By Theorem 5.1.6, the only numbers in 
the set {1, 2, ..., p - 1} that are congruent to their own multiplicative inverses are 
the numbers 1 and p - 1. Leave those two numbers aside for the moment. Note that 
if y is a multiplicative inverse of x, then x is a multiplicative inverse of y. Thus, 
the numbers in the set {2, 3, ..., p - 2} each have multiplicative inverses in that 
same set, and each number in that set differs from its multiplicative inverse. By 
Lemma 5.1.7, no two numbers in the set can have the same inverse. Therefore, 
we can arrange the numbers in the set {2, 3, ..., p - 2} in pairs consisting of 
a number and its multiplicative inverse. Since the product of a number and its 
multiplicative inverse is congruent to 1 modulo p, the product of all the numbers 
in the set {2, 3, ..., p - 2} is congruent to 1 modulo p. Thus, 2 · 3 ··· (p - 2) ≡ 1 
(mod p). Multiplying both sides by 1 gives 1 · 2 · 3 ··· (p - 2) ≡ 1 (mod p). Now 
p - 1 ≡ -1 (mod p), so 1 · 2 ··· (p - 2) · (p - 1) ≡ 1 · (-1) (mod p). In other 
words, (p - 1)! ≡ -1 (mod p), which yields (p - 1)! + 1 ≡ 0 (mod p). □ 


Theorem 5.2.2. If m is a composite number larger than 4, then (m - 1)! ≡ 0 
(mod m) (so that (m - 1)! + 1 ≡ 1 (mod m)). 


Proof. Let m be any composite number larger than 4. We must show that (m - 1)! 
is divisible by m. If m = ab, with a different from b and both less than m, then 
a and b each occur as distinct factors in (m - 1)!. Thus, m = ab is a factor of 
(m - 1)!, so (m - 1)! is congruent to 0 modulo m. The only composite numbers 
m that cannot be written as a product of two distinct natural numbers less than m 
are those numbers that are squares of primes. (To see this, use the fact that every 
composite can be written as a product of primes.) Thus, the only remaining case to 
prove is when m = p^2 for some prime p. In this case, if m is larger than 4, then p 
is a prime bigger than 2. In that case, p^2 is greater than 2p. Thus, p^2 - 1 is greater 
than or equal to 2p, so (p^2 - 1)! contains the factor 2p as well as the factor p. 
Therefore, (p^2 - 1)! contains the product 2p^2. In particular, (m - 1)! is divisible by 
m = p^2. □ 


The following combines Wilson’s Theorem and its converse. 


Theorem 5.2.3. If m is a natural number other than 1, then (m - 1)! + 1 ≡ 0 
(mod m) if and only if m is a prime number. 


Proof. This follows immediately from Wilson’s Theorem when m is prime and from 
the previous theorem for composite m in all cases except for m = 4. If m = 4, then 
(m - 1)! + 1 = 3! + 1 = 7, which is not congruent to 0 modulo 4, so the theorem 
holds for all m. □ 


It might be thought that Wilson’s Theorem would provide a good way to check 
whether or not a given number m is prime: simply see whether m divides (m - 1)! + 1. 
However, the fact that (m - 1)! is so much larger than m makes this a very impractical 
way of testing primality for large values of m. 
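
To make both points concrete, here is a small Python sketch (an added illustration, not part of the original text; the function name `is_prime_wilson` is hypothetical). It applies the criterion of Theorem 5.2.3 directly and also reports how many decimal digits 1000! has, which hints at why the test is impractical.

```python
from math import factorial


def is_prime_wilson(m: int) -> bool:
    """Primality test based on Theorem 5.2.3: m divides (m - 1)! + 1."""
    if m < 2:
        return False
    return (factorial(m - 1) + 1) % m == 0


primes_below_30 = [m for m in range(2, 30) if is_prime_wilson(m)]
print(primes_below_30)

# The factorial grows extremely quickly: count the digits of 1000!.
digits_needed = len(str(factorial(1000)))
print(digits_needed)
```

The test is correct for every m, but even for a number m with only a few dozen digits the factorial (m - 1)! has far too many factors to be multiplied out in practice.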


5.3 Problems 


Basic Exercises 


1. Find the remainder when 24! is divided by 103. 

2. Find a solution x to each of the following congruences: 


(a) 2^x ≡ 1 (mod 103) 
(b) 16! · x ≡ 5 (mod 17) 


3. Find the remainder when 99! - 1 is divided by 101. 


Interesting Problems 


4. Suppose that p is a prime greater than 2 and a ≡ b^2 (mod p) for some natural 
number b that is not divisible by p. Prove that a^{(p-1)/2} ≡ 1 (mod p). 

5. Find three different prime factors of 10^{12} - 1. 

6. Let p be a prime number. Prove that 1^2 · 2^2 · 3^2 ··· (p - 1)^2 - 1 is divisible by 
p. 

7. For each of the following congruences, either find a solution or prove that no 
solution exists. 


(a) 102! · x + x ≡ 4 (mod 103) 
(b) x^{16} - 2 ≡ 0 (mod 17) 


8. Find the remainder when: 


(a) (9!- 16 + 4311)8© is divided by 11 
(b) 42! + 7°8 + 66 is divided by 29 


9. If a is a natural number and p is a prime number, show that a^p + a · (p - 1)! is 
divisible by p. 

10. Find the remainder that 100 + 273 + 16! + 29! leaves upon division by 19. 


Challenging Problems 


11. Show that a natural number n > 1 is prime if and only if n divides (n - 2)! - 1. 

12. Show that, if p is a prime number and a and b are natural numbers, then 


(a + b)^p ≡ a^p + b^p (mod p) 


13. Prove that, for all primes p > 2, (p - 2)! ≡ 1 (mod p). 

14. Prove that, for all primes p > 3, 2 · (p - 3)! ≡ -1 (mod p). 

15. Is there a prime number p such that (p - 1)! + 6 is divisible by p? 

16. Find all prime numbers p such that p divides (p - 2)! + 6. 

17. Suppose 2^k + 1 is a prime number. Prove that k has no prime divisors other 
than 2. 
[Hint: If k = ab with b odd, consider 2^k + 1 modulo 2^a + 1.] 

18. Prove that a^{q-1} ≡ 1 (mod pq) if p and q are distinct primes such that p - 1 
divides q - 1 and neither p nor q divides a. 


Chapter 6 
Sending and Receiving Secret Messages 


Since ancient times, people have devised ways of sending secret messages to each 
other. Much of the original interest was for military purposes: commanders of one 
section of an army wanted to send messages to commanders of other sections of their 
army in such a way that the message could not be understood by enemy soldiers who 
might intercept it. 

Some of the current interest in secret messages is still for military and similarly 
horrible purposes. However, there are also many other kinds of situations in which 
it is important to be able to send secret messages. For example, a huge amount 
of information is communicated via the internet. It is important that some of that 
information remain private, known only to the sender and recipient. One common 
situation is making withdrawals from bank accounts over the internet. If someone 
else was able to intercept the information being sent, that interceptor could transfer 
funds from the sender’s bank account to the interceptor’s bank account. There are 
many other commercial and personal communications that are sent electronically 
that people wish to keep secret. 

“Cryptography” refers to techniques for reconfiguring messages so that they 
cannot be understood except by the intended recipient. Encrypting a message is 
the process of reconfiguring it; decrypting is the process of obtaining the original 
message from the encrypted one. For a method of cryptography to be useful, it must 
be the case that it would be virtually impossible (or at least extremely difficult) for 
anyone other than the intended recipient to be able to decrypt the messages. 

A fundamental problem is that the intended recipient must have the information 
that is needed to decrypt encrypted messages. If the sender has to send the 
decrypting information to the recipient, unintended interceptors (e.g., someone who 
wants to transfer your money to his or her bank account) might get access to the 
method of decrypting as that method is being transmitted to the intended recipient. 
It can be very difficult to get the method of decrypting to the intended recipient 
while making sure that no one can intercept it en route. 

The techniques of encrypting and decrypting messages for a given procedure are 
called the “keys” for that procedure. There must be a “key” for encrypting messages 
and a “key” for decrypting them. 

Beginning in the 1970s, many people wondered whether there could be public 
key cryptography: that is, a method of doing cryptography that has the property 
that everyone in the world (the “public”) can be told how to send the recipient an 
encrypted message. On the other hand, the recipient must be the only one who 
can decrypt messages sent using that procedure. That is, public key cryptography 
refers to methods of sending messages that allow the person who wishes to receive 
messages to publicly announce the way messages should be encrypted in such a 
way that only the person making the announcement can decrypt the messages. 
This seems to be impossible. If people know how to encrypt messages, won’t they 
necessarily also be able to figure out how to decrypt them, just by reversing the 
encrypting procedure? 


6.1 The RSA Method 


Several methods for public key cryptography have been discovered. To actually 
use these methods requires computing with very large numbers. Thus, the methods 
would not be feasible without computers. One method is called “RSA” after three of 
the people who played important roles in its development: Ron Rivest, Adi Shamir, 
and Leonard Adleman. The only mathematics that is required to establish that the 
RSA method works is Fermat’s Little Theorem (5.1.2). 

Here is an outline of the method. The recipient announces to the entire world the 
following way to send messages. If you want to send a message, the first thing that 
you must do is to convert the message into a natural number. There are many ways 
of doing that; here is a rough description of one possibility. Write your message out 
as sentences in, say, the English language. Then convert the sentences into a natural 
number as follows. Let A = 11, B = 12, C = 13, ..., Z = 36. Let 37 represent 
a space. Let 38 represent a period, 39 a comma, 40 a semicolon, 41 a full colon, 
42 an exclamation point, and 43 an apostrophe. If desired, other symbols could 
be represented by other two-digit natural numbers. Convert your English language 
message into a number by replacing each of the elements of your sentences by their 
corresponding numbers in the order that they appear. For any substantial message, 
this will result in a large natural number. Everyone would be able to reconstruct the 
English language message from that number if this procedure was known to them. 
For example, the sentence 


PUBLIC KEY CRYPTOGRAPHY IS NEAT. 

would be represented by the number 

2631122219133721153537132835263025172811261835371929372415113038 

Furthermore, if you read the rest of this chapter 

35253143222237212425333733183537193037332528212938 
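
The conversion just described is easy to mechanize. The following is a minimal sketch in Python, assuming exactly the table given above (A = 11 through Z = 36, then the listed punctuation); the names `CODES`, `encode`, and `decode` are ours and are not part of the method.

```python
import string

# Two-digit codes from the text: A=11, ..., Z=36, then space and punctuation.
CODES = {ch: 11 + i for i, ch in enumerate(string.ascii_uppercase)}
CODES.update({" ": 37, ".": 38, ",": 39, ";": 40, ":": 41, "!": 42, "'": 43})
SYMBOLS = {code: ch for ch, code in CODES.items()}


def encode(message: str) -> int:
    """Concatenate the two-digit codes of the characters into one natural number."""
    return int("".join(str(CODES[ch]) for ch in message.upper()))


def decode(number: int) -> str:
    """Split the decimal digits into pairs and look up each pair in the table."""
    digits = str(number)
    return "".join(SYMBOLS[int(digits[i:i + 2])] for i in range(0, len(digits), 2))


print(encode("PUBLIC KEY CRYPTOGRAPHY IS NEAT."))
```

Because every code has exactly two digits and none begins with 0, the number can be split back into pairs without ambiguity; applying `decode` to a number recovers the sentence.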


The RSA technique is a method for encrypting and decrypting numbers. Both the 
recipient and those who send messages must use computers to do the computations 
that are required; the numbers involved in any application of the technique that could 
realistically protect messages are much too large for the computations to be done by 
hand. 

RSA encryption proceeds as follows. The person who wishes to receive messages, 
the recipient, chooses two very large prime numbers p and q that are different 
from each other, and then defines N to be pq. The recipient publicly announces 
the number N. However, the recipient keeps p and q secret. If p and q are large 
enough, it is not feasible for anyone other than the recipient to find p or q simply 
from knowing N; factoring very large numbers is an extremely difficult problem for 
even the most powerful current computers. There are some very large known prime 
numbers; such primes can easily be chosen so that the resulting N = pq is impossible to 
factor in any reasonable amount of time. The recipient announces another natural 
number E, which we will call the encryptor, in addition to N. Below we will explain 
ways of choosing suitable E’s. 

The recipient then instructs all those who wish to send messages to do the 
following. Write your message as a natural number as described above. Let’s say 
that M is the number representing your message. For this method to work, M must 
be less than N. If M is greater than or equal to N, you could divide your message 
into several smaller messages, each of which corresponds to a natural number less 
than N. The method we shall describe only works when M is less than N. 

“To send me messages,” the recipient announces to the world, “take your message 
M and compute the remainder that $M^E$ (i.e., M raised to the power E) leaves upon 
division by N, and send me that remainder.” 

In other words, to send a message M, the sender computes the R between 0 and 
N - 1 such that $M^E \equiv R \pmod{N}$. The sender then sends R to the recipient. 

How can the message be decrypted? That is, how can the recipient recover the 
original message M from R? This will require finding a decryptor, which will be 
possible for anyone who knows the factorization of N as the product pq but virtually 
impossible for anyone else. We shall see that, if E is chosen properly, there is a 
decryptor D such that for every integer L between 0 and N - 1, $L^{ED} \equiv L \pmod{N}$. 
For such a D, since $R \equiv M^E \pmod{N}$, it follows that $R^D \equiv M^{ED} \pmod{N}$, and 
therefore, since $M^{ED} \equiv M \pmod{N}$, $R^D \equiv M \pmod{N}$. Thus, the recipient 
decrypts the message by finding the remainder that $R^D$ leaves upon division 
by N. 

Before discussing how to find encryptors E and decryptors D and why this 
method works, let’s look at a simple example. In this example, the numbers are 
so small that anyone could figure out what p and q are, so this example could not 
realistically be used to encrypt messages. However, it illustrates the method. 


Example 6.1.1. Let p = 7 and q = 11 be the primes; then N = pq = 77. Suppose 
that E = 13; as we shall see, there are always many possible values for E. Below 
we will discuss the properties that E must have. There is a technique for finding D, 
based on knowing p and q, that we shall describe later; that technique will produce 
D = 37 in this particular example. 

In this example, the recipient announces N = 77 and E = 13 to the general 
public; the recipient keeps the values of p, q, and D secret. 

The recipient instructs the world how to send messages. Suppose you want to 
send the message M = 71. Following the encryption rule, you must compute 
the remainder that $M^E = 71^{13}$ leaves upon division by 77. This is equivalent to 
calculating $71^{13} \pmod{77}$. Let’s compute that as follows, using some of the facts 
about modular arithmetic that we learned in the previous chapters. First, $71 \equiv -6 \pmod{77}$, 
so $M^E \equiv (-6)^{13} \pmod{77}$. Now $6^3 = 216$ and $216 \equiv -15 \pmod{77}$, 
so $6^6 = (6^3)^2 \equiv (-15)^2 = 225 \equiv -6 \pmod{77}$. Therefore, 


$$\begin{aligned}
(-6)^{13} &\equiv -6 \cdot (-6)^{12} \pmod{77} \\
&\equiv -6 \cdot 6^{12} \pmod{77} \\
&\equiv -6 \cdot (6^6)^2 \pmod{77} \\
&\equiv -6 \cdot (-6)^2 \pmod{77} \\
&\equiv -6^3 \pmod{77} \\
&\equiv 15 \pmod{77}
\end{aligned}$$


Thus, the encrypted version of your message is 15. 

Anyone who sees that the encrypted version is 15 would be able to discover your 
original message if they knew the decryptor. But the recipient is the only one who 
knows the decryptor. 

In this special, easy example, the recipient receives 15 and proceeds to decrypt 
it, using the decryptor 37, as follows. Your original message will be the remainder 
that $15^{37}$ leaves upon division by 77. Compute: $15^2 \equiv -6 \pmod{77}$. Then, 
$15^{26} \equiv (-6)^{13} \pmod{77}$, which (as we saw above) is congruent to 15 (mod 77). Also, 
from $15^2 \equiv -6 \pmod{77}$ we obtain $15^8 \equiv (-6)^4 = 6 \cdot 6^3 \equiv 6 \cdot (-15) = -90 \equiv 64 \pmod{77}$. 
Therefore, 


$$\begin{aligned}
15^{37} &\equiv 15^{26} \cdot 15^{8} \cdot 15^{3} \pmod{77} \\
&\equiv 15 \cdot 64 \cdot 15^{3} \pmod{77} \\
&\equiv 64 \cdot 15^{4} \pmod{77} \\
&\equiv 64 \cdot (15^{2})^{2} \pmod{77} \\
&\equiv 64 \cdot (-6)^{2} \pmod{77} \\
&\equiv (-13) \cdot 36 \pmod{77} \\
&\equiv -468 \pmod{77}
\end{aligned}$$


Of course, -468 is congruent to -468 plus any multiple of 77. Now 7 · 77 = 539. 
Hence, $15^{37} \equiv -468 \equiv -468 + 539 = 71 \pmod{77}$. Therefore, we have decrypted 
the received message, 15, and obtained the original message, 71. (The number 71 
must be the original message, since it is the only natural number less than N that is 
congruent to 71 modulo N.) 
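
The hand computation above works by repeatedly squaring and reducing modulo 77. A computer can do the same thing systematically. The following Python sketch is one standard way to organize it ("square and multiply"); the function name `power_mod` is ours, and Python's built-in `pow(base, exponent, modulus)` performs the same task.

```python
def power_mod(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent modulo modulus by repeated squaring."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # current binary digit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus        # square for the next binary digit
        exponent >>= 1
    return result


N, E, D = 77, 13, 37          # the values of Example 6.1.1
M = 71                        # the original message
R = power_mod(M, E, N)        # the encrypted message
print(R, power_mod(R, D, N))  # encrypted message, then the decrypted message
```

No intermediate number ever exceeds the square of the modulus, which is what makes the computation feasible when the exponent has hundreds of digits.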


The above looks somewhat complicated. We now proceed to explain and analyze 
the method in more detail. 

For p and q distinct primes and N = pq, we use the notation $\phi(N)$ to 
denote (p - 1)(q - 1). (This is a particular case of a more general concept, 
known as the Euler $\phi$ function, that we will introduce in the next chapter.) The 
theorem that underlies the RSA technique is an easy consequence of Fermat’s Little 
Theorem (5.1.2). 


Theorem 6.1.2. Let N = pq, where p and q are distinct prime numbers, and let 
$\phi(N) = (p-1)(q-1)$. If k and a are any natural numbers, then $a \cdot a^{k\phi(N)} \equiv a \pmod{N}$. 


Proof. The conclusion of the theorem is equivalent to the assertion that N divides 
the product of a and $a^{k(p-1)(q-1)} - 1$. Since N is the product of the distinct primes 
p and q, this is equivalent to the product being divisible by both p and q, for a 
natural number is divisible by both p and q if and only if its canonical factorization 
(Corollary 4.1.2) includes both of the primes p and q. 

Consider p (obviously the same proof works for q). There are two cases. First, 
if p divides a, then p certainly divides $a \cdot (a^{k(p-1)(q-1)} - 1)$. If p does not divide 
a, then, by Fermat’s Little Theorem (5.1.2), $a^{p-1} \equiv 1 \pmod{p}$. Raising both sides 
of this congruence to the power k(q - 1) shows that $a^{k(p-1)(q-1)} \equiv 1 \pmod{p}$. 
Thus, p divides $a^{k(p-1)(q-1)} - 1$, so it also divides $a \cdot (a^{k(p-1)(q-1)} - 1)$. This 
establishes the result in the case that p does not divide a. Thus, in both cases, p 
divides $a \cdot a^{k(p-1)(q-1)} - a$. Therefore, $a \cdot a^{k\phi(N)} \equiv a \pmod{N}$. □ 
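
A numerical check is no substitute for the proof, but it can make the statement concrete. The following Python sketch tests the congruence of Theorem 6.1.2 for a range of values of a and k; the function name `theorem_holds` and the ranges tested are our own choices.

```python
def theorem_holds(p: int, q: int, max_a: int = 200, max_k: int = 5) -> bool:
    """Check a * a**(k*(p-1)*(q-1)) ≡ a (mod p*q) for 1 <= a <= max_a, 1 <= k <= max_k."""
    N = p * q
    phi = (p - 1) * (q - 1)
    return all(
        (a * pow(a, k * phi, N)) % N == a % N
        for a in range(1, max_a + 1)
        for k in range(1, max_k + 1)
    )


print(theorem_holds(7, 11))   # distinct primes, as the theorem requires
print(theorem_holds(3, 3))    # p = q, so the hypothesis "distinct" fails
```

The second call shows why the primes must be distinct: with p = q = 3 and a = 3, the left-hand side is divisible by 9 while a is not.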


How does this theorem apply to the RSA method? We pick as an encryptor, E, 
any natural number greater than 1 that has no factor other than 1 in common with $\phi(N)$. As we 
shall see in the next chapter, this implies that there is a natural number D such that 
ED is equal to the sum of 1 and a multiple of $\phi(N)$; that is, there is a D such that 
$ED = 1 + k\phi(N)$ for some natural number k. The theorem we have just proven 
shows that D is a decryptor, as follows. Suppose that M is the original message, so 
that $R \equiv M^E \pmod{N}$ is its encryption. Since R is congruent to $M^E$ modulo N, 
$R^D$ is congruent to $M^{ED}$ modulo N. But $ED = 1 + k\phi(N)$, so $R^D$ is congruent to 
$M^{1+k\phi(N)}$ modulo N. By the above theorem, $M^{1+k\phi(N)}$ is congruent to M modulo 
N. (Of course, M is a natural number less than N, which uniquely determines it.) 

The explanation of how to find decryptors requires some additional mathematical 
tools that we develop in the next chapter. If N is very small, decryptors can be found 
simply by trial and error. 
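
For small numbers, the trial-and-error search can be sketched in Python as follows. This is an illustration only, assuming the searcher knows p and q and looks for a D with ED = 1 + kφ(N) as described above; the function name `find_decryptor` is ours.

```python
def find_decryptor(p: int, q: int, E: int):
    """Try D = 1, 2, 3, ... until E*D leaves remainder 1 upon division by (p-1)(q-1)."""
    phi = (p - 1) * (q - 1)
    for D in range(1, phi):
        if (E * D) % phi == 1:
            return D
    return None   # no such D exists: E shares a factor greater than 1 with phi


D = find_decryptor(7, 11, 13)   # the numbers of Example 6.1.1
print(D)

# Confirm that D really decrypts: L**(E*D) ≡ L (mod 77) for every L from 0 to 76.
print(all(pow(L, 13 * D, 77) == L for L in range(77)))
```

The search takes at most φ(N) steps, which is hopeless when p and q have hundreds of digits; that is why a better technique is needed.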

A complete description of the RSA technique, including choosing encryptors 
and finding decryptors, is given in the next chapter (see The RSA Procedure for 
Encrypting Messages 7.2.5). 


6.2 Problems 
Basic Exercises 


1. You are to receive a message using the RSA system. You choose p = 5, q = 7, 
and E = 5. Verify that D = 5 is a decryptor. The encrypted message you 
receive is 17. What is the actual (decrypted) message? 

2. Use the RSA system with N = 21 and the encryptor E = 5. 


(a) Encrypt the message M = 7. 
(b) Verify that D = 5 is a decryptor. 
(c) Decrypt the encrypted form of the message. 


3. A person tries to receive messages without you being able to decrypt them. 
The person announces N = 15 and E = 7 to the world; the person uses such 
low numbers assuming that you don’t understand RSA. An encrypted message 
R = 8 is sent. By trial and error, find a decryptor, D, and use it to find the 
original message. 


Chapter 7 
The Euclidean Algorithm 
and Applications 


Each pair of natural numbers has a greatest common divisor; i.e., a largest natural 
number that is a factor of both of the numbers in the pair. For example, the greatest 
common divisor of 27 and 15 is 3, the greatest common divisor of 36 and 48 is 12, 
the greatest common divisor of 257 and 101 is 1, the greatest common divisor of 4 
and 20 is 4 and the greatest common divisor of 7 and 7 is 7. 


Notation 7.0.1. The greatest common divisor of the natural numbers a and b is 
denoted gcd(a, b). 


Thus, gcd(27, 15) = 3, gcd(36, 48) = 12 and gcd(7, 21) = 7. 

One way to find the greatest common divisor of a pair of natural numbers is by 
factoring the numbers into primes. Then the greatest common divisor of the two 
numbers is obtained in the following way: For each prime that occurs as a factor 
of both numbers, find the highest power of that prime that is a common factor of 
both numbers and then multiply all those primes to all those powers together to 
get the greatest common divisor. For example, since $48 = 2^4 \cdot 3$ and $56 = 2^3 \cdot 7$, 
$\gcd(48, 56) = 2^3 = 8$. As another example, note that gcd(1292, 14440) = 76, since 
$1292 = 2^2 \cdot 17 \cdot 19$ and $14440 = 2^3 \cdot 5 \cdot 19^2$ and $2^2 \cdot 19 = 76$. 
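
The factoring method can be sketched in Python as follows. The helper names `prime_factors` and `gcd_by_factoring` are ours, and the factoring is done by simple trial division, which is practical only for small numbers.

```python
from collections import Counter


def prime_factors(n: int) -> Counter:
    """Return the prime factorization of n as a mapping prime -> exponent."""
    factors = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] += 1
            n //= d
        d += 1
    if n > 1:                 # whatever remains is itself a prime
        factors[n] += 1
    return factors


def gcd_by_factoring(a: int, b: int) -> int:
    """Multiply each common prime raised to the highest power dividing both numbers."""
    fa, fb = prime_factors(a), prime_factors(b)
    result = 1
    for prime in fa.keys() & fb.keys():
        result *= prime ** min(fa[prime], fb[prime])
    return result


print(gcd_by_factoring(48, 56), gcd_by_factoring(1292, 14440))
```

The weakness of this approach is the factoring step itself, which becomes infeasible for very large numbers.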

Another way of finding the greatest common divisor of two natural numbers is 
by using what is called the Euclidean Algorithm. One advantage of this method is 
that it provides a way of expressing the greatest common divisor as a combination 
of the two original numbers in a way that can be extremely useful. In particular, 
this technique will allow us to compute a decryptor for each encryptor chosen for 
RSA encrypting. As we shall see, other applications of the Euclidean Algorithm 
include a method for finding integer solutions of linear equations in two variables 
(Diophantine equations) and a different proof of the Fundamental Theorem of 
Arithmetic. 

The Euclidean Algorithm is based on the ordinary operation of division of natural 
numbers, allowing for a remainder. As we have seen, we can express that concept of 
division as follows: if a and b are any natural numbers, then there exist nonnegative 
integers q and r such that a = bq + r and 0 ≤ r < b. (Recall that the number q is 
called the quotient and the number r is called the remainder in this equation.) If b 
divides a, then, of course, r = 0. 

Let a and b be natural numbers. The Euclidean Algorithm for finding the 
greatest common divisor of a and b is the following technique. If b = a, then 
clearly the greatest common divisor is a. Suppose that b is less than a. (If b is 
greater than a, interchange the roles of a and b in what follows.) Divide a by b 
as described above to get q and r satisfying a = bq + r with 0 ≤ r < b. If 
r = 0, then clearly the greatest common divisor of a and b is b itself. If r is 
not 0, divide r into b, to get $b = rq_1 + r_1$, where $0 \le r_1 < r$. If $r_1 = 0$, 
stop here. If $r_1$ is different from 0, divide $r_1$ into r to get $r = r_1 q_2 + r_2$, where 
$0 \le r_2 < r_1$. Continue this process until there is the remainder 0. (That will 
have to occur eventually since the remainders are all nonnegative integers and 
each one is less than the preceding one.) Thus, there is a sequence of equations as 
follows: 


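$$\begin{aligned}
a &= bq + r \\
b &= rq_1 + r_1 \\
r &= r_1 q_2 + r_2 \\
r_1 &= r_2 q_3 + r_3 \\
&\;\;\vdots \\
r_{k-3} &= r_{k-2} q_{k-1} + r_{k-1} \\
r_{k-2} &= r_{k-1} q_k + r_k \\
r_{k-1} &= r_k q_{k+1}
\end{aligned}$$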


We show that $r_k$ is the greatest common divisor of the original a and b. To see 
this, note first that $r_k$ is a common divisor of a and b. This can be seen by “working 
your way up” the equations. Replacing $r_{k-1}$ by $r_k q_{k+1}$ in the next to last equation 
gives $r_{k-2} = r_k q_{k+1} q_k + r_k = r_k (q_{k+1} q_k + 1)$. Thus, $r_k$ divides $r_{k-2}$. The equation 
for $r_{k-3}$ can then be rewritten: 


$$r_{k-3} = r_k (q_{k+1} q_k + 1) q_{k-1} + r_k q_{k+1} = r_k \bigl( (q_{k+1} q_k + 1) q_{k-1} + q_{k+1} \bigr)$$


Thus, $r_{k-3}$ is also divisible by $r_k$. Continuing to work upwards eventually shows 
that $r_k$ divides r, then b, and then a. Therefore, $r_k$ is a common divisor of a and b. 
To show that $r_k$ is the greatest common divisor of a and b, we show that every 
other common divisor of a and b divides $r_k$. Suppose, then, that d is a natural number 
that divides both a and b. The equation a = bq + r shows that d also divides r. Since 
d divides both b and r, it divides $r_1$; since it divides r and $r_1$, it divides $r_2$; and so 
on. Eventually, we see that d also divides $r_k$. Hence, every common divisor of a and 
b divides $r_k$, so $r_k$ is the greatest common divisor of a and b. 

Let’s look at an example. Suppose we want to use the Euclidean Algorithm to 
find the greatest common divisor of 33 and 24. We begin with 33 = 24 · 1 + 9. Then, 
24 = 9 · 2 + 6. Then, 9 = 6 · 1 + 3. Then, 6 = 3 · 2. Thus, the greatest common 
divisor of 33 and 24 is 3. 
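
The same sequence of divisions can be produced by a short program. Here is a minimal Python sketch; the name `euclid_steps` is ours, and each recorded tuple (a, b, q, r) stands for one equation a = b · q + r.

```python
def euclid_steps(a: int, b: int):
    """Run the Euclidean Algorithm on natural numbers a and b.

    Returns the greatest common divisor together with the list of
    division steps (a, b, q, r), each meaning a = b*q + r.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, b, q, r))
        a, b = b, r           # the divisor and remainder become the next pair
    return a, steps


g, steps = euclid_steps(33, 24)
for a, b, q, r in steps:
    print(f"{a} = {b} * {q} + {r}")
print("greatest common divisor:", g)
```

The last nonzero remainder is the greatest common divisor, exactly as in the argument above.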


Definition 7.1.1. A linear combination of the integers a and b is an expression 
of the form ax + by, where x and y are integers. We say that the integer d is a 
linear combination of the integers a and b if there exist integers x and y such that 
ax+by=d. 


Obtaining the greatest common divisor by the Euclidean Algorithm allows us to 
express the greatest common divisor as a linear combination of the original numbers, 
as follows. First consider the above example. From the next to last equation, we get 
3 = 9 - 6 · 1. Substituting the expression for 6 obtained from the previous equation 
into this one gives 


3 = 9 - (24 - 9 · 2) · 1 = 9 - 24 + 9 · 2 = 9 · 3 - 24 


Then solve for 9 in the first equation, getting 9 = 33 - 24 · 1, and substitute this into 
the above equation to get 3 = (33 - 24 · 1) · 3 - 24 = 33 · 3 - 24 · 4. Therefore, 
3 = 33 · 3 + 24(-4). The greatest common divisor of the numbers 33 and 24, namely 3, is 
expressed in the last equation as a linear combination of 33 and 24. 

The Euclidean Algorithm can always be used, as in the above example, to write 
the greatest common divisor of two natural numbers as a linear combination of those 
numbers. That is, given natural numbers a and b with greatest common divisor d, 
there exist integers x and y such that d = ax + by. This can be seen by working 
upwards in the sequence of equations that constitute the Euclidean Algorithm, as 
in the above example. The next to last equation can be used to write the greatest 
common divisor, $r_k$, as a linear combination of $r_{k-1}$ and $r_{k-2}$; simply solve the next 
to last equation for $r_k$. Solving for $r_{k-1}$ in the previous equation and substituting 
represents $r_k$ as a linear combination of $r_{k-2}$ and $r_{k-3}$. By continuing to work our 
way up the equations in the Euclidean Algorithm, we eventually obtain $r_k$ as a linear 
combination of the given numbers a and b. 
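
The back-substitution can also be carried out mechanically. The following Python sketch is one common recursive formulation (often called the extended Euclidean algorithm); the function name `extended_gcd` is ours, and it is intended for natural numbers a and b.

```python
def extended_gcd(a: int, b: int):
    """Return (d, x, y) with d = gcd(a, b) and a*x + b*y == d."""
    if b == 0:
        return a, 1, 0                    # gcd(a, 0) = a = a*1 + 0*0
    d, x, y = extended_gcd(b, a % b)      # d = b*x + (a % b)*y
    # Since a % b = a - (a // b)*b, regroup to get d = a*y + b*(x - (a // b)*y).
    return d, y, x - (a // b) * y


d, x, y = extended_gcd(33, 24)
print(d, x, y, 33 * x + 24 * y)
```

For 33 and 24 this reproduces the coefficients found by hand above, namely 3 = 33 · 3 + 24(-4).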


7.2 Applications 


Definition 7.2.1. The integers a and b are said to be relatively prime if their only 
common divisor is 1; that is, if gcd(a, b) = 1. 


By the above-described consequence of the Euclidean Algorithm, gcd(a, b) = 1 
implies that there exist integers x and y such that ax + by = 1. This fact forms 
the basis for a different proof of the Fundamental Theorem of Arithmetic (4.1.1). 
We begin by using this fact to prove the following lemma. (This is a restatement of 
Corollary 4.1.3; however, it is presented with a new and independent proof.) 


Lemma 7.2.2. If a prime number divides the product of two natural numbers, then 
it divides at least one of the numbers. 


Proof. Suppose that p is prime and p divides ab. If p divides a, then we are done. 
So suppose that p does not divide a; we show that in this case p divides b. Since p 
is prime, the only possible factors that a could have in common with p are 1 and p. 
Therefore, a and p are relatively prime and so there exist integers x and y such that 
ax + py = 1. Multiply through by b, getting bax + bpy = b. Since p divides ab, 
it divides the left-hand side of this equation, so it must divide b. □ 


We need a slightly stronger lemma which follows easily from the above. 


Lemma 7.2.3. For any natural number n, if a prime divides the product of n natural 
numbers, then it divides at least one of the numbers. 


Proof. This is a simple consequence of the previous lemma and mathematical 
induction. The previous lemma is the case n = 2. Suppose that the result is true 
for n factors, where n is greater than or equal to 2. Assume that p is prime and 
that p divides $a_1 a_2 \cdots a_{n+1}$. If p does not divide $a_1$, then by the case n = 2, p 
divides $a_2 \cdots a_{n+1}$. Hence, by the inductive hypothesis, p divides at least one of 
$a_2, a_3, \ldots, a_{n+1}$. □ 


We are now able to present another proof of the Fundamental Theorem of 
Arithmetic. 


Theorem 7.2.4 (The Fundamental Theorem of Arithmetic). The factorization of 
a natural number greater than 1 into primes is unique except for the order of the 
primes. 


Proof. If there were natural numbers with two distinct factorizations, then, by the 
Well-Ordering Principle (2.1.2), there would exist a smallest such natural number, 
say N. Then $N = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r} = q_1^{\beta_1} q_2^{\beta_2} \cdots q_s^{\beta_s}$. Notice that N cannot be prime, 
since there is only one way to express a prime. Since $p_1$ divides N, it divides 
$q_1^{\beta_1} q_2^{\beta_2} \cdots q_s^{\beta_s}$. By Lemma 7.2.3, $p_1$ divides some $q_j$. Since $p_1$ and $q_j$ are prime, 
$p_1 = q_j$. Dividing both expressions for N by this common factor would then 
yield a smaller natural number with two distinct factorizations. This contradiction 
establishes the result. □ 


The Euclidean Algorithm is used to find decryptors in the RSA cryptography system. 
Before explaining this in general, we illustrate it in the case of Example 6.1.1 
from the previous chapter. In that example, we started with p = 7 and q = 11, so 
that N = 77 and $\phi(N) = 6 \cdot 10 = 60$. We took the encryptor E = 13. The crucial 
property of the encryptor is that it is relatively prime to $\phi(N)$. That is true in this 
case; clearly the only common factor of 13 and 60 is 1. Since gcd(13, 60) = 1, the 
consequence of the Euclidean Algorithm discussed above implies that there exist 
integers x and y such that 1 = 13x + 60y, or 13x = 1 - 60y. Note that if x and 
y satisfy this equation, then, for every m, 13(x + 60m) = 1 - 60(y - 13m), since 
this latter equation is obtained from the previous one by simply adding 13 · 60m to 
both sides of the equation. Thus, if the original x was negative, we could choose 
a positive m large enough so that x + 60m is positive. Therefore, without loss of 
generality, we can assume that x is positive, which forces y to be negative in the 
equation 13x = 1 - 60y. Replace -y by u; then 13x = 1 + 60u, with x and u 
both positive integers. We will find such x and u using the Euclidean Algorithm. 
First, however, note that any such x is a decryptor. To see this, first note that, 
as in Example 6.1.1, $M^{13}$ is congruent to the encrypted version of the message 
M. Thus, the encrypted version of the message to the power x is congruent to 
$(M^{13})^x = M^{13x} = M^{1+60u} = M \cdot M^{60u}$, which is congruent modulo 77 to M 
by Theorem 6.1.2. 

To obtain a decryptor for this example, we begin by using the Euclidean 
Algorithm to find gcd(13, 60): 


60 = 13 · 4 + 8 
13 = 8 · 1 + 5 
8 = 5 · 1 + 3 
5 = 3 · 1 + 2 
3 = 2 · 1 + 1 
2 = 1 · 2 


Thus, the greatest common divisor of 13 and 60 is 1. Of course, we knew that 
already; we chose 13 to be relatively prime to 60. The point of using the Euclidean 
Algorithm is that it allows us to express 1 as a linear combination of 13 and 60, as 
follows. From the above equation 3 = 2 · 1 + 1 we get 1 = 3 - 2. Since 5 = 3 · 1 + 2, 
we have 1 = 3 - 2 = 3 - (5 - 3) = 3 · 2 - 5. Continuing by working our way up 
and collecting coefficients gives the following: 


1 = 3 · 2 - 5 
= (8 - 5) · 2 - 5 
= 8 · 2 - 5 · 3 
= 8 · 2 - (13 - 8) · 3 
= 8 · 5 - 13 · 3 
= (60 - 13 · 4) · 5 - 13 · 3 
= 60 · 5 - 13 · 23 


Equivalently, 1 - 60 · 5 = -13 · 23. We are not done. We must find positive integers 
k and D such that 1 + 60k = 13D. For any integer m, adding -13 · 60m to both 
sides of the above equation gives 1 - 60 · (5 + 13m) = 13 · (-23 - 60m). Taking 
m = -1 in this equation gives 1 + 60 · 8 = 13 · 37. Thus, 37 is a decryptor. 
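
The computation of this decryptor can be sketched in Python. The version below keeps track of the coefficients while the divisions are performed, rather than back-substituting afterwards; this is a common variant and not necessarily how one would work by hand. The names `coefficients` and `decryptor` are ours.

```python
def coefficients(a: int, b: int):
    """Return (d, x, y) with d = gcd(a, b) and a*x + b*y == d (iterative version)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r       # the usual division step
        old_x, x = x, old_x - q * x       # keep old_r == a*old_x + b*old_y throughout
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def decryptor(E: int, phi: int) -> int:
    """Return the positive D less than phi with E*D = 1 + k*phi for some k."""
    d, x, _ = coefficients(E, phi)
    if d != 1:
        raise ValueError("E must be relatively prime to phi")
    return x % phi                        # adds a multiple of phi to make x positive


print(coefficients(13, 60))   # coefficients x, y with 13*x + 60*y = 1
print(decryptor(13, 60))
```

Reducing x modulo 60 plays the role of choosing m in the text: it replaces -23 by -23 + 60 = 37.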

We have illustrated and proven the RSA technique. The following is a statement 
of what we have established. 


The RSA Procedure for Encrypting Messages 7.2.5. The recipient chooses 
(very large) distinct prime numbers p and q and lets N = pq and $\phi(N) = 
(p-1)(q-1)$. The recipient then chooses a natural number E (which we 
are calling the “encryptor” and is often called the “public exponent”) greater 
than 1 that is relatively prime to $\phi(N)$. The pair of numbers (N, E) is called 
the “public key.” The recipient announces the public key and states that any 
message M consisting of a natural number less than N can be sent as follows: 
Compute the natural number R less than N such that $M^E \equiv R \pmod{N}$. The 
encrypted message that is sent is the natural number R. The recipient decrypts the 
message by using the Euclidean Algorithm to find natural numbers D (which 
we are calling the “decryptor” and is often called the “private exponent”) 
and k such that $1 + k\phi(N) = ED$. The pair of numbers (N, D) is called 
the “private key”; the recipient keeps D secret. The recipient then recovers the 
original message M as the natural number less than N that is congruent to $R^D \equiv M^{ED}$ 
(mod N). 
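
The whole procedure can be put together in a few lines of Python. This is a toy sketch under stated assumptions, not a secure implementation: the function names are ours, the primes are far too small for real use, and the call `pow(E, -1, phi)` (available in Python 3.8 and later) is used in place of carrying out the Euclidean Algorithm explicitly.

```python
from math import gcd


def make_keys(p: int, q: int, E: int):
    """Return (public key, private key) = ((N, E), (N, D)) for distinct primes p and q."""
    N = p * q
    phi = (p - 1) * (q - 1)
    if E <= 1 or gcd(E, phi) != 1:
        raise ValueError("E must be greater than 1 and relatively prime to phi(N)")
    D = pow(E, -1, phi)       # the D with E*D = 1 + k*phi and 0 < D < phi
    return (N, E), (N, D)


def encrypt(M: int, public_key) -> int:
    """Return the R less than N with M**E ≡ R (mod N)."""
    N, E = public_key
    if not 0 < M < N:
        raise ValueError("the message must be a natural number less than N")
    return pow(M, E, N)


def decrypt(R: int, private_key) -> int:
    """Return the natural number less than N congruent to R**D (mod N)."""
    N, D = private_key
    return pow(R, D, N)


public, private = make_keys(7, 11, 13)    # the numbers of Example 6.1.1
R = encrypt(71, public)
print(public, private, R, decrypt(R, private))
```

Only `make_keys` uses p and q; a sender needs nothing beyond the public key.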


The technique that we used to find decryptors can be used to solve many other 
practical problems. 


Definition 7.2.6. A linear Diophantine equation is an equation of the form ax + 
by = c, where a, b, and c are integers, and for which we seek solutions (x, y) with 
x and y integers. 


Example 7.2.7. A store sells two different kinds of boxes of candies. One kind sells 
for 9 dollars a box and the other kind for 16 dollars a box. At the end of the day, the 
store has received 143 dollars from the sale of boxes of candy. How many boxes did 
the store sell at each price? 

How can we approach this problem? If x is the number of the less expensive 
boxes sold and y is the number of the more expensive boxes sold, then the 
information we are given is 


9x + 16y = 143 


There are obviously an infinite number of pairs (x, y) of real numbers that satisfy 
this equation; the graph in the plane of the set of solutions is a straight line. However, 
we know more about x and y than simply that they satisfy that equation. We know 
that they are both nonnegative integers. Are there nonnegative integral solutions? 
Are there any integral solutions at all? Since 9 and 16 are relatively prime, the 
Euclidean Algorithm tells us that there exist integers s and t (possibly negative) satisfying 
9s + 16t = 1. Multiplying through by 143 gives 9(143s) + 16(143t) = 143. 
Therefore, there are integral solutions. However, it is not immediately clear whether 
there are nonnegative integral solutions, which the actual problem requires. Let’s 
investigate. 

We will use the Euclidean Algorithm to find integral solutions to the equation 
9s + 16t = 1. We first use the Euclidean Algorithm to find the greatest common 
divisor (even though we know it already): 


16 = 9 · 1 + 7 
9 = 7 · 1 + 2 
7 = 2 · 3 + 1 
2 = 1 · 2 


Working our way back upwards to express 1 as a linear combination of 9 and 16 
gives 


1 = 7 - 2 · 3 
= 7 - (9 - 7) · 3 
= 7 · 4 - 9 · 3 
= (16 - 9) · 4 - 9 · 3 
= 16 · 4 - 9 · 7 


Therefore, 9(-7) + 16 · 4 = 1. Multiplying by 143 yields 9(-7 · 143) + 16(4 · 143) = 
143. Note that 7 · 143 = 1001 and 4 · 143 = 572. For any integer m, we can add and 
subtract 16 · 9m on the left-hand side of the above equation; thus, for every integer 
m, 


9(-1001 + 16m) + 16(572 - 9m) = 143 


This gives infinitely many integer solutions; what about nonnegative solutions? 

We require that -1001 + 16m be at least 0. That is equivalent to 16m ≥ 1001, 
or m ≥ 1001/16. Thus, m ≥ 62.5625. The smallest integer m satisfying this inequality 
is m = 63. When m = 63, -1001 + 16m = 7 and 572 - 9m = 5. Thus, one pair of 
nonnegative solutions to the original equation is x = 7 and y = 5. Are there other 
nonnegative solutions? We will show that all the solutions of this equation are of the 
form x = -1001 + 16m and y = 572 - 9m, for some integer m (see Example 7.2.11 
below). To show that the only nonnegative solution is (7, 5) we reason as follows. 
If we take the next possible m, m = 64, then the y we get is 572 - 9 · 64 = -4. 
Obviously, if m is even larger, 572 - 9m will be even more negative. Therefore, the 
only pair of nonnegative solutions to the original equation is (7, 5). Thus, the store 
sold 7 of the cheaper boxes and 5 of the more expensive boxes of candy. □ 
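
For numbers this small, the answer can be confirmed by simply trying every possible number of cheaper boxes. The following Python sketch does that; it is a brute-force check rather than the method of the example, the function name `nonnegative_solutions` is ours, and it assumes that a, b, and c are natural numbers.

```python
def nonnegative_solutions(a: int, b: int, c: int):
    """List all pairs (x, y) of nonnegative integers with a*x + b*y == c."""
    solutions = []
    for x in range(c // a + 1):           # a*x may not exceed c
        remainder = c - a * x
        if remainder % b == 0:
            solutions.append((x, remainder // b))
    return solutions


print(nonnegative_solutions(9, 16, 143))
```

Brute force works here because 143 is small; the parametrized family of solutions found above works no matter how large the numbers are.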


The basic theorem about solutions of linear Diophantine equations is the 
following. 


Theorem 7.2.8. The Diophantine equation ax + by = c, with a, b, and c integers, 
has integral solutions if and only if gcd(a, b) divides c. 


Proof. Let d = gcd(a, b). If there is a pair of integers (x, y) satisfying the equation, 
then ax + by = c and, since d divides both of a and b, it follows that d divides c. 
This proves the easy part of the theorem. 

The converse is also easy, but only because of what we have learned about the 
Euclidean Algorithm. We used the Euclidean Algorithm to prove that there exists 
a pair (s, t) of integers satisfying as + bt = d. If d divides c, then there is a k 
satisfying c = dk. Let x = sk and y = tk. Then clearly ax + by = c. □ 
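
The proof is constructive, so it translates directly into a procedure. The following Python sketch follows it step by step for natural numbers a and b: find s and t with as + bt = d, check whether d divides c, and scale by k = c/d. The function names are ours.

```python
def extended_gcd(a: int, b: int):
    """Return (d, s, t) with d = gcd(a, b) and a*s + b*t == d, for natural a and b."""
    if b == 0:
        return a, 1, 0
    d, s, t = extended_gcd(b, a % b)
    return d, t, s - (a // b) * t


def particular_solution(a: int, b: int, c: int):
    """Return one integer pair (x, y) with a*x + b*y == c, or None if there is none."""
    d, s, t = extended_gcd(a, b)
    if c % d != 0:
        return None           # gcd(a, b) does not divide c, so no solution exists
    k = c // d
    return s * k, t * k       # x = sk and y = tk, as in the proof


print(particular_solution(9, 16, 143))
print(particular_solution(6, 10, 7))
```

The first call returns the pair (-1001, 572) obtained in Example 7.2.7; the second returns None because gcd(6, 10) = 2 does not divide 7.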


As we’ve seen in the example where we determined the number of boxes of 
each kind of candy sold (Example 7.2.7), it is sometimes important to be able to 
determine all the solutions of a Diophantine equation. We use the very easy fact that 
(x + bm, y - am) is a solution of ax + by = c whenever (x, y) is a solution. This 
follows since a(x + bm) + b(y - am) = ax + abm + by - abm = ax + by. This 
shows that a Diophantine equation has an infinite number of solutions if it has any 
solution at all. However, in some situations, such as the problem about determining 
the number of different kinds of boxes of candy that were sold, it is important to 
have a unique solution that satisfies some other condition of the problem (such as 
requiring that both of x and y be nonnegative). Theorem 7.2.10 below precisely 
describes all the solutions of a given linear Diophantine equation. 
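
The shifting fact used in this paragraph can be seen in a small Python sketch, using the equation and the solution (-1001, 572) from Example 7.2.7; the function name `shifted` is ours.

```python
def shifted(x: int, y: int, a: int, b: int, m: int):
    """Move a solution (x, y) of a*x + b*y = c to the solution (x + b*m, y - a*m)."""
    return x + b * m, y - a * m


a, b, c = 9, 16, 143
x0, y0 = -1001, 572                       # the solution found in Example 7.2.7
for m in (0, 1, 63, -2):
    x, y = shifted(x0, y0, a, b, m)
    print(m, (x, y), a * x + b * y == c)  # each shifted pair still satisfies the equation
```

Shifting by multiples of (b, -a) need not reach every solution when a and b have a common factor greater than 1. For example, (3, -1) and (8, -4) are both solutions of 6x + 10y = 8, but 8 - 3 = 5 is not a multiple of 10. That is why a more precise description of all the solutions is needed.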

We require a lemma that generalizes the fact that if a prime divides a product, 
then it divides at least one of the factors (Lemma 7.2.2). 


Lemma 7.2.9. If s divides tu and s is relatively prime to u, then s divides t. 


Proof. The hypothesis implies that there exists an r such that tu = rs. Write the 
canonical factorization of u into primes, $u = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_n^{\alpha_n}$. Then, 


$$t \, p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_n^{\alpha_n} = rs$$


Imagine factoring both sides of this equation into a product of primes. By the 
Fundamental Theorem of Arithmetic (see 4.1.1 or 7.2.4), the factorization of the 
left-hand side into primes has to be the same as the factorization of the right-hand 
side. Since s is relatively prime to u, none of the primes comprising s are among the 
$p_i$. Thus, all the primes in s occur to at least the same power in the factorization of 
t, and therefore s divides t. This proves the lemma. □ 


Theorem 7.2.10. Let gcd(a, b) = d. The linear Diophantine equation ax + by = c 
has a solution if and only if d divides c. If d divides c and $(x_0, y_0)$ is a solution, then 
the integral solutions of the equation are all the pairs 
$\left(x_0 + m \cdot \frac{b}{d},\; y_0 - m \cdot \frac{a}{d}\right)$, where m assumes all integral values. 


Proof. We already established the first assertion, the criterion for the existence of a 
solution (Theorem 7.2.8). If $(x_0, y_0)$ is a solution, it is easy to see that each of the 
other pairs is also a solution, for 

\[ a\left(x_0 + m \cdot \frac{b}{d}\right) + b\left(y_0 - m \cdot \frac{a}{d}\right) = ax_0 + m \cdot \frac{ab}{d} + by_0 - m \cdot \frac{ab}{d} = ax_0 + by_0 = c \]

All that remains to be proven is that there are no solutions other than those 
described in the theorem. To see this, suppose that $(x_0, y_0)$ is a solution and that 
(x, y) is any other solution of ax + by = c. Since $ax_0 + by_0 = c$, we can subtract 
the second equation from the first to conclude that 

\[ a(x - x_0) + b(y - y_0) = 0 \]

Bring one of the terms to the other side and divide both sides of this equation by d 
to get 

\[ \frac{a}{d}(x - x_0) = \frac{b}{d}(y_0 - y) \]

Note that $\frac{a}{d}$ and $\frac{b}{d}$ are relatively prime. (For if e was a common factor greater than 
1, then $d \cdot e$ would be a common divisor of a and b greater than d.) Hence, by 
Lemma 7.2.9, $\frac{a}{d}$ divides $(y_0 - y)$ and $\frac{b}{d}$ divides $(x - x_0)$. That is, there are integers 
k and l such that $y_0 - y = k \cdot \frac{a}{d}$ and $x - x_0 = l \cdot \frac{b}{d}$. Equivalently, $y = y_0 - k \cdot \frac{a}{d}$ 
and $x = x_0 + l \cdot \frac{b}{d}$. For (x, y) to be a solution, we must have 

\[ a\left(x_0 + l \cdot \frac{b}{d}\right) + b\left(y_0 - k \cdot \frac{a}{d}\right) = c \]

Thus, 

\[ ax_0 + l \cdot \frac{ab}{d} + by_0 - k \cdot \frac{ba}{d} = c \]

Since $ax_0 + by_0 = c$, we get $l \cdot \frac{ab}{d} - k \cdot \frac{ba}{d} = 0$. Therefore, l = k. Call this common 
value m. Then, 

\[ x = x_0 + m \cdot \frac{b}{d}, \qquad y = y_0 - m \cdot \frac{a}{d} \]

This proves the theorem. □


Example 7.2.11. The uniqueness of the solution to the “candy boxes problem” 
(Example 7.2.7) follows from this theorem. In that example, gcd(9, 16) = 1, so 
all the solutions are indeed of the form (-1001 + 16m, 572 - 9m). □
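
As a rough illustration of how Theorem 7.2.10 pins down a solution with nonnegative entries, the sketch below walks through the family $(x_0 + m \cdot \frac{b}{d},\, y_0 - m \cdot \frac{a}{d})$ and keeps only the pairs in which both entries are nonnegative. The function name `nonnegative_solutions` is hypothetical, a and b are assumed to be natural numbers, and the right-hand side 143 used in the check is simply the value of 9(-1001) + 16(572) computed from the particular solution quoted above.

```python
from math import gcd


def nonnegative_solutions(a, b, x0, y0):
    """All pairs (x0 + m*b/d, y0 - m*a/d) with both entries >= 0.

    Assumes a and b are natural numbers and (x0, y0) is an integer
    solution of a*x + b*y = c for some c.
    """
    d = gcd(a, b)
    step_x, step_y = b // d, a // d
    m_min = -(x0 // step_x)   # smallest m with x0 + m*step_x >= 0
    m_max = y0 // step_y      # largest m with y0 - m*step_y >= 0
    return [(x0 + m * step_x, y0 - m * step_y) for m in range(m_min, m_max + 1)]


solutions = nonnegative_solutions(9, 16, -1001, 572)
print(solutions)  # [(7, 5)]
```

Only m = 63 gives a pair with both entries nonnegative, which is consistent with the uniqueness claimed in the example.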


There are many other interesting applications of the theorem concerning 
solutions of linear Diophantine equations (see, for example, the problems at the 
end of this chapter). 

Recall that we used the notation $\varphi(N)$ to denote (p - 1)(q - 1) when we were 
describing the RSA technique with N = pq, where p and q were distinct prime 
numbers. This is a special case of notation for a useful general concept. 


Definition 7.2.12. The Euler $\varphi$ function is defined for natural numbers m by: $\varphi(m)$ 
is equal to the number of integers in {1, 2, ..., m} that are relatively prime to m. 


Example 7.2.13. To compute $\varphi(8)$, we consider the set {1, 2, 3, 4, 5, 6, 7, 8}. We 
get $\varphi(8) = 4$, since 1, 3, 5, and 7 are the numbers in the set that are relatively prime 
to 8. Similarly, $\varphi(7) = 6$, and $\varphi(12) = 4$. □
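
The definition translates directly into a short computation. The following is a minimal sketch, assuming a helper named `phi` (a name chosen here for illustration); it counts by brute force and is not meant to be efficient.

```python
from math import gcd


def phi(m):
    """Count the integers in {1, 2, ..., m} that are relatively prime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


print(phi(8), phi(7), phi(12))  # 4 6 4
```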


Theorem 7.2.14. If p is prime, then $\varphi(p) = p - 1$. 


Proof. Since p is prime, every number in {1, 2, ..., p - 1} is relatively prime to p, 
so $\varphi(p) = p - 1$. □


In discussing the RSA technique, we used the notation $\varphi(pq) = (p - 1)(q - 1)$ 
when p and q were distinct primes. This is consistent with the definition of $\varphi$ we 
are now using. 


Theorem 7.2.15. If p and q are distinct primes, then $\varphi(pq) = (p - 1)(q - 1)$. 


Proof. Suppose that p and q are primes with p less than q (since they are different, 
one of them is less than the other), and let N = pq. Clearly pq is not relatively 
prime to N. Thus, to find $\varphi(N)$ we must determine how many numbers in the set 
S = {1, 2, 3, ..., p, ..., q, ..., pq - 1} are relatively prime to N. If a number is not 
relatively prime to N, then it must be divisible by either p or q or both. However, 
an element k of S cannot be divisible by both p and q. For if it was, the canonical 
factorization (Corollary 4.1.2) of k would show that it was divisible by pq. This is 
impossible since every number in S is less than pq. 

There are a total of pq - 1 numbers in S; how many multiples of p are there in 
S? The set S contains p, 2p, 3p, and so on, up to (q - 1)p, since qp is not in S. 
Thus, there are q - 1 multiples of p in S. Similarly, there are p - 1 multiples of q 
in S. Therefore, there is a total of (q - 1) + (p - 1) = p + q - 2 numbers in S that 
are not relatively prime to N. Since there are pq - 1 numbers in S, the number of 
numbers in S that are relatively prime to N is 

\[ pq - 1 - (p + q - 2) = pq - p - q + 1 = (p - 1)(q - 1) \]

Therefore, $\varphi(N) = (p - 1)(q - 1)$. □
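
The counting argument in this proof can be checked numerically for particular primes. The sketch below is illustrative only: the function name `count_in_S` is an assumption, and the primes 5 and 11 are arbitrary choices, not values taken from the text.

```python
from math import gcd


def count_in_S(p, q):
    """Count, within S = {1, ..., pq - 1}, the multiples of p, the
    multiples of q, and the numbers relatively prime to pq."""
    n = p * q
    s = range(1, n)
    multiples_of_p = sum(1 for k in s if k % p == 0)
    multiples_of_q = sum(1 for k in s if k % q == 0)
    coprime = sum(1 for k in s if gcd(k, n) == 1)
    return multiples_of_p, multiples_of_q, coprime


mult_p, mult_q, coprime = count_in_S(5, 11)
print(mult_p, mult_q, coprime)  # 10 4 40
```

For p = 5 and q = 11 there are q - 1 = 10 multiples of p, p - 1 = 4 multiples of q, and (p - 1)(q - 1) = 40 numbers relatively prime to 55, as the proof predicts.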


There is a formula for ¢(m) for any natural number m greater than 1, in terms of 
the canonical factorization of m into a product of primes (see Problem 27 at the end 
of this chapter). 

Fermat’s beautiful theorem that $a^{p-1} \equiv 1 \pmod{p}$ (5.1.2) (for primes p and 
natural numbers a that are not divisible by p) can be generalized to composite 
moduli. We require the following lemma that generalizes Theorem 5.1.1. 

Lemma 7.2.16. If a is relatively prime to m and $ax \equiv ay \pmod{m}$, then 
$x \equiv y \pmod{m}$. 


Proof. We are given that m divides ax - ay. That is, m divides a(x - y). By 
Lemma 7.2.9, m divides x - y. Thus, $x \equiv y \pmod{m}$. □


Euler’s Theorem 7.2.17. If m is a natural number greater than 1 and a is a natural 
number that is relatively prime to m, then $a^{\varphi(m)} \equiv 1 \pmod{m}$. 


Proof. The proof is very similar to the proof of Fermat’s Little Theorem (5.1.2). 
Let $S = \{r_1, r_2, \ldots, r_{\varphi(m)}\}$ be the set of numbers in {1, 2, ..., m} that are relatively 
prime to m. Then let $T = \{ar_1, ar_2, \ldots, ar_{\varphi(m)}\}$. Clearly, no two of the numbers in 
S are congruent to each other, since no two of them have the same remainder when 
divided by m. Note also that no two of the numbers in T are congruent to each other, 
since $ar_i \equiv ar_j \pmod{m}$ would imply, by Lemma 7.2.16, that $r_i \equiv r_j \pmod{m}$. 
Moreover, each $ar_i$ is relatively prime to m and therefore so is any number that $ar_i$ 
is congruent to. Thus, the numbers in $\{ar_1, ar_2, \ldots, ar_{\varphi(m)}\}$ are congruent, in some 
order, to the numbers in $\{r_1, r_2, \ldots, r_{\varphi(m)}\}$. It follows, as in the proof of Fermat’s 
Little Theorem, that the product of all the numbers in T is congruent to the product 
of all the numbers in S. That is, 

\[ ar_1 \cdot ar_2 \cdots ar_{\varphi(m)} \equiv r_1 r_2 \cdots r_{\varphi(m)} \pmod{m} \]

Since $r_1 r_2 \cdots r_{\varphi(m)}$ is relatively prime to m, we can divide both sides of this 
congruence by that product (see Lemma 7.2.16) to get $a^{\varphi(m)} \equiv 1 \pmod{m}$. □

Fermat’s Little Theorem is a special case of Euler’s. 


Corollary 7.2.18 (Fermat’s Little Theorem). If p is a prime and p does not 
divide a, then $a^{p-1} \equiv 1 \pmod{p}$. 


Proof. Since p is prime, the fact that p does not divide a means that a and p are 
relatively prime. Also, $\varphi(p) = p - 1$. Thus, Fermat’s Little Theorem follows from 
Euler’s Theorem (7.2.17). □
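
Euler’s Theorem and its corollary are easy to test numerically for small moduli. The sketch below is an illustration, not a proof: `phi` and `euler_holds` are hypothetical helper names, and Python's built-in three-argument `pow(a, e, m)` is used to compute $a^e$ modulo m.

```python
from math import gcd


def phi(m):
    """Euler's function: how many of 1, ..., m are relatively prime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


def euler_holds(a, m):
    """Check a**phi(m) == 1 (mod m); meaningful when gcd(a, m) == 1 and m > 1."""
    return pow(a, phi(m), m) == 1


print(euler_holds(5, 12))   # True: phi(12) = 4 and 5**4 = 625 = 52*12 + 1
print(pow(3, 7 - 1, 7))     # 1, the special case of Fermat's Little Theorem
```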


7.3 Problems 


Basic Exercises 


1. Find the greatest common divisor of each of the following pairs of integers in 
two different ways, by using the Euclidean Algorithm and by factoring both 
numbers into primes: 


(a) 252 and 198 
(b) 291 and 573 
(c) 1800 and 240 
(d) 52 and 135 

2. For each of the pairs in Problem 1 above, write the greatest common divisor as 
a linear combination of the given numbers. 

3. Find integers x and y such that 3x - 98y = 12. 

4. (a) Find a formula for all integer solutions of the linear Diophantine equation 
3x + 4y = 14. 
(b) Find all pairs of natural numbers that solve the above equation. 

5. Let $\varphi$ be Euler’s $\varphi$ function. Find: 
(a) $\varphi(12)$ (e) $\varphi(97)$ 
(b) $\varphi(26)$ (f) $\varphi(73)$ 
(c) $\varphi(21)$ (g) $\varphi(101 \cdot 37)$ 
(d) $\varphi(36)$ (h) 63!) 

6. Use the Euclidean Algorithm to find the decryptors in Problems 1, 2, and 3 in 
Chapter 6. 

Interesting Problems 

7. Use the Euclidean Algorithm (assisted by a calculator) to find the greatest 
common divisor of each of the following pairs of natural numbers: 
(a) 47,295 and 297 
(b) 77,777 and 2,891 

8. Find the smallest natural number x such that 24x leaves a remainder of 2 upon 
division by 59. 

9. A small theater has a student rate of $3 per ticket and a regular rate of $10 per 
ticket. Last night $243 was collected from the sale of tickets. There were more 
than 50 but less than 60 tickets sold. How many student tickets were sold? 

10. A liquid comes in 17 liter and 13 liter cans. Someone needs exactly 287 liters of 
the liquid. How many cans of each size should the person buy? 

11. Let a, b, and n be natural numbers. Prove that if $a^n$ and $b^n$ are relatively prime, 
then a and b are relatively prime. 

12. Let a, b, m, and n be natural numbers with m and n greater than 1. Assume that 
m and n are relatively prime. Prove that if $a \equiv b \pmod{m}$ and $a \equiv b \pmod{n}$, 
then $a \equiv b \pmod{mn}$. 

13. Let a and b be natural numbers. 
(a) Suppose there exist integers m and n such that am + bn = 1. Prove that a 
and b are relatively prime. 
(b) Prove that 5a + 2 and 7a + 3 are relatively prime for every natural number a. 

14. Let p be a prime number. Prove that $\varphi(p^2) = p^2 - p$. 

15. The public key N = 55 and E = 7 is announced. The encrypted message 5 is 
received. 

(a) Find a decryptor, D, and prove that D is a decryptor. 
(b) Decrypt 5 to find the original message. 

16. Find a multiplicative inverse of 27? modulo 9. 

17. Prove that a has a multiplicative inverse modulo m if and only if a and m are 
relatively prime. 

Challenging Problems 

18. Suppose that a and b are relatively prime natural numbers such that ab is a 
perfect square. Show that a and b are each perfect squares. 

19. Show that if m and n are relatively prime and a and b are any integers, then 
there is an integer x that simultaneously satisfies the two congruences 
$x \equiv a \pmod{m}$ and $x \equiv b \pmod{n}$. 

20. Generalize the previous problem as follows (this result is called the Chinese 
Remainder Theorem): 
If $\{m_1, m_2, \ldots, m_k\}$ is a collection of natural numbers greater than 1, each pair 
of which is relatively prime, and if $\{a_1, a_2, \ldots, a_k\}$ is any collection of integers, 
then there is an integer x that simultaneously satisfies all of the congruences 
$x \equiv a_i \pmod{m_i}$. Moreover, if $x_1$ and $x_2$ are both simultaneous solutions of 
all of those congruences, then $x_1 \equiv x_2 \pmod{m_1 m_2 \cdots m_k}$. 

21. Let p be an odd prime and let m = 2p. Prove that $a^{m-1} \equiv a \pmod{m}$ for all 
natural numbers a. 

22. Let a and b be relatively prime natural numbers greater than or equal to 2. Prove 
that $a^{\varphi(b)} + b^{\varphi(a)} \equiv 1 \pmod{ab}$. 

23. Suppose that a, b, and c are each natural numbers. Prove that there are at most 
a finite number of pairs of natural numbers (x, y) that satisfy ax + by =c. 

24. Show that m is prime if there is an integer a such that $a^{m-1} \equiv 1 \pmod{m}$ and 
$a^k \not\equiv 1 \pmod{m}$ for every natural number k < m - 1. 

25. Suppose that a and m are relatively prime and that k is the smallest natural 
number such that $a^k$ is congruent to 1 modulo m. Prove that k divides $\varphi(m)$. 

26. For p a prime and k a natural number, show that $\varphi(p^k) = p^k - p^{k-1}$. 

27. (Very Challenging) If the canonical factorization of the natural number n into 
primes is 

\[ n = p_1^{k_1} \cdot p_2^{k_2} \cdots p_r^{k_r} \]

prove that 

\[ \varphi(n) = \left(p_1^{k_1} - p_1^{k_1 - 1}\right) \cdot \left(p_2^{k_2} - p_2^{k_2 - 1}\right) \cdots \left(p_r^{k_r} - p_r^{k_r - 1}\right) \]


Chapter 8 
Rational Numbers and Irrational Numbers 


The only numbers that we have discussed so far are the “whole numbers;” that 
is, the integers. There are many other interesting things that can be said about the 
integers, but, for now, we consider other numbers, the rational numbers, also known 
as “fractions,” and then the real numbers. 


8.1 Rational Numbers 


Definition 8.1.1. A rational number is a number of the form $\frac{m}{n}$ where m and n are 
integers and $n \neq 0$. 


: 3-7 12 1 2 
Some examples of rational numbers are qe ays Saee a) and re 
Wait a minute. Are 5 and ; different rational numbers? They are not; they are two 
; : : . 2 _ 1-7 _ 7 
different expressions representing the same number. Similarly 7 = 7, = = =, 


g = 3 and so on. The condition under which two different expressions as quotients 


of integers represent the same rational number is the following. 


Definition 8.1.2. The rational number $\frac{m_1}{n_1}$ is equal to the rational number $\frac{m_2}{n_2}$ when 
$m_1 n_2 = m_2 n_1$. 

Thus, when we use the expression $\frac{1}{2}$, we recognize that we are representing a 
number that could also be denoted by other quotients of integers. 

Why don’t we allow 0 denominators in the expressions for rational numbers? If 
we did allow 0 denominators, the arithmetic would be very peculiar. For example, $\frac{7}{0}$ 
would equal $\frac{-12}{0}$, since $7 \cdot 0 = -12 \cdot 0$. In fact, we would have $\frac{a}{0} = \frac{b}{0}$ for all integers 
a and b. It is not at all useful to have such peculiarities as part of our arithmetic, so 
we do not allow 0 to be a denominator of any rational number. 


Definition 8.1.3. The expression $\frac{m}{n}$ for a rational number is said to be in lowest 
terms if m and n are relatively prime. 


Note that a representation of a rational number in lowest terms can be obtained 
by starting with any representation of the rational number and “dividing out” all the 
common factors of the numerator and the denominator. 
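
“Dividing out” the common factors amounts to dividing the numerator and the denominator by their greatest common divisor. A minimal sketch, assuming a hypothetical helper `lowest_terms` that represents a fraction as a pair (m, n); moving any minus sign into the numerator is a convention chosen here, not something the text prescribes.

```python
from math import gcd


def lowest_terms(m, n):
    """Return (m', n') in lowest terms representing the same rational as m/n."""
    if n == 0:
        raise ZeroDivisionError("0 is not allowed as a denominator")
    d = gcd(m, n)            # math.gcd always returns a nonnegative value
    m, n = m // d, n // d
    if n < 0:                # convention: keep the denominator positive
        m, n = -m, -n
    return m, n


print(lowest_terms(12, -36))  # (-1, 3)
print(lowest_terms(2, 4))     # (1, 2)
```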


Notation 8.1.4. The set of all rational numbers is denoted by Q. 


The operations of multiplication and addition of rational numbers can be defined 
in terms of the operations on integers. 


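Definition 8.1.5. The product of the rational numbers $\frac{m_1}{n_1}$ and $\frac{m_2}{n_2}$, denoted 
$\frac{m_1}{n_1} \cdot \frac{m_2}{n_2}$ or simply $\frac{m_1}{n_1}\frac{m_2}{n_2}$, is the rational number 

\[ \frac{m_1 m_2}{n_1 n_2} \]

The sum of the rational numbers $\frac{m_1}{n_1}$ and $\frac{m_2}{n_2}$ is the rational number 

\[ \frac{m_1}{n_1} + \frac{m_2}{n_2} = \frac{m_1 n_2 + m_2 n_1}{n_1 n_2} \]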


We can think of the integers as the rational numbers whose denominator is 1; 
we invariably write them without the denominator. For example, we write -4 for 
$\frac{-4}{1}$ (and also, of course, for any other quotient equal to it). In particular, we write 
0 for $\frac{0}{1}$ and 1 for $\frac{1}{1}$. Note that, from Definition 8.1.5, 0 and 1 are, respectively, additive and 
multiplicative identities for the rational numbers, as they are for the integers. That 
is, $\frac{m}{n} + 0 = \frac{m}{n}$ and $\frac{m}{n} \cdot 1 = \frac{m}{n}$, for every rational number $\frac{m}{n}$. Also note that, as 
is the case with the set of integers, every rational number has an additive inverse: 
$\frac{m}{n} + \frac{-m}{n} = 0$. 


Definition 8.1.6. A multiplicative inverse for the number x is a number y such that 
xy=1. 

Of course, 0 has no multiplicative inverse, since 0 times any number is 0. If x and 
y are both integers and xy = 1, then x and y must both be 1 or -1. Hence, the only 


integers that have multiplicative inverses within the set of integers are the numbers 
1 and —1. In the set of rational numbers, the situation is very different. 


Theorem 8.1.7. If $\frac{m}{n}$ is a rational number other than 0, then $\frac{m}{n}$ has a multiplicative 
inverse. 

Proof. If $\frac{m}{n} \neq 0$, then $m \neq 0$. Therefore, $\frac{n}{m}$ is also a rational number and 
$\frac{m}{n} \cdot \frac{n}{m} = \frac{mn}{nm} = 1$. Therefore, $\frac{n}{m}$ is a multiplicative inverse for $\frac{m}{n}$. □
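
The definitions of this section can be mirrored in code by treating a rational number as a pair (m, n) of integers with n different from 0. The sketch below is an illustration under that assumption; the function names `equal`, `multiply`, `add`, and `inverse` are hypothetical, and the results are compared with Python's standard `fractions.Fraction` type as a sanity check.

```python
from fractions import Fraction


def equal(r1, r2):
    """Definition 8.1.2: m1/n1 equals m2/n2 when m1*n2 == m2*n1."""
    (m1, n1), (m2, n2) = r1, r2
    return m1 * n2 == m2 * n1


def multiply(r1, r2):
    """Definition 8.1.5: the product is (m1*m2)/(n1*n2)."""
    (m1, n1), (m2, n2) = r1, r2
    return m1 * m2, n1 * n2


def add(r1, r2):
    """Definition 8.1.5: the sum is (m1*n2 + m2*n1)/(n1*n2)."""
    (m1, n1), (m2, n2) = r1, r2
    return m1 * n2 + m2 * n1, n1 * n2


def inverse(r):
    """Theorem 8.1.7: a nonzero m/n has multiplicative inverse n/m."""
    m, n = r
    if m == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return n, m


total = add((1, 2), (1, 3))
print(total, equal(total, (5, 6)))                       # (5, 6) True
print(equal(multiply((3, 4), inverse((3, 4))), (1, 1)))  # True
print(Fraction(*total) == Fraction(1, 2) + Fraction(1, 3))  # True
```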

8.2 Irrational Numbers 


In a sense, all actual numerical computations, by human or electronic computers, 
are done with rational numbers. However, it is important, within mathematics 
itself and in using mathematics to understand the world, to have other numbers 
as well. 


Example 8.2.1. Suppose that you walk one mile due east and then one mile due 
north. How far are you from your starting point? The straight line from your 
starting point to your final position is the hypotenuse of a right triangle (see 
Definition 11.3.2) whose legs are each one mile long. The length of the hypotenuse 
is the distance that you are from your starting point. If x denotes that distance, then 
the Pythagorean Theorem (see 11.3.6) tells us that $x^2 = 2$. 

It is obviously useful to have some number that denotes that distance. Is there 
a rational number x such that $x^2 = 2$? This question can be rephrased: are there 
integers m and n with $n \neq 0$ such that $\left(\frac{m}{n}\right)^2 = 2$? This, of course, is equivalent 
to the question of whether there are integers m and n different from 0 that satisfy 
the equation $m^2 = 2n^2$. This is a very concrete question about integers; what is the 
answer? 


Theorem 8.2.2. There do not exist integers m and n with $n \neq 0$ such that $\left(\frac{m}{n}\right)^2 = 2$. 


Proof. If there exist any such m and n, then, of course, there would exist such 
m and n that are relatively prime. We will show that this assumption leads to a 
contradiction. From $\left(\frac{m}{n}\right)^2 = 2$, we get $m^2 = 2n^2$. The equation $m^2 = 2n^2$ 
implies that $m^2$ is an even number, since it is the product of 2 and another number. 
What about m itself? If m were odd, then m - 1 would have to be even, so 
m - 1 = 2k for some integer k, or m = 2k + 1. It would follow from this that 
$m^2 = (2k + 1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1$, which is an odd number (since 
it is 1 more than a multiple of 2). Thus, if m were odd, $m^2$ would have to be odd. 
Since $m^2$ is even, we conclude that m is even. 

We now proceed to prove that n is also even. We know that m = 2s for some 
integer s, from which it follows that $m^2 = 4s^2$. Substituting $4s^2$ for $m^2$ in the 
equation $m^2 = 2n^2$ gives $4s^2 = 2n^2$, or $2s^2 = n^2$. Thus, $n^2$ is an even number and, 
reasoning as we did above for m, it follows that n itself is an even number. 

Therefore, if $\left(\frac{m}{n}\right)^2 = 2$, then m and n are both divisible by 2. This contradicts 
the assumption that m and n are relatively prime. □

We have proven that there is no rational number that satisfies the equation 
$x^2 = 2$. Is there any number that satisfies this equation? It would obviously 
be very important to have such a number, for the purpose of specifying the 
distance you would be from your starting point in Example 8.2.1 and for many 
other purposes. Mathematicians have developed what are called the real numbers; 
the real numbers include numbers for all possible distances. The real numbers 
can be put into correspondence with the points on a line by labeling one 
point “0” and marking points to the right of 0 with the distances that they are 
from 0 (using any fixed units). Points on the line to the left of 0 are labeled 
with corresponding negative real numbers. The resulting real number line looks 
like 

[Figure: the real number line] 


The set of real numbers and the arithmetical operations on them can be precisely 
constructed in terms of rational numbers. In fact, there are several ways to do that. 
None of the ways of constructing the real numbers in terms of the rational numbers 
are easy; they all require substantial development. There are two main approaches, 
one using Cauchy sequences and the other using Dedekind cuts. The Dedekind cuts 
approach is outlined in Problem 15 at the end of this chapter. For the present, we 
simply assume that the real numbers exist and that the arithmetical operations on 
them have the usual properties. 


Notation 8.2.3. The set of all real numbers is denoted by R. 


It can be shown that there is a positive real number x such that $x^2 = 2$. This 
number is denoted $\sqrt{2}$ or $2^{1/2}$. 


Definition 8.2.4. For y and n natural numbers, the $n^{\text{th}}$ root of y is defined to be the 
positive real number x such that $x^n = y$. This is denoted either $y^{1/n}$ or $\sqrt[n]{y}$. More 
generally, $y^{m/n}$ is defined to be $\left(y^{1/n}\right)^m$. It can be shown that, for each y, m and n, $y^{m/n}$ 
defines a unique real number. 


Definition 8.2.5. A real number that is not a rational number is said to be irrational. 


Theorem 8.2.2 shows that $\sqrt{2}$ is not a rational number and thus can be rephrased 
as follows. 


Theorem 8.2.6. The number $\sqrt{2}$ is irrational. 


The symbol $\sqrt{3}$ represents the positive real number satisfying $(\sqrt{3})^2 = 3$; is $\sqrt{3}$ 
irrational too? 
We can establish a more general result. 


Theorem 8.2.7. If p is a prime number, then $\sqrt{p}$ is irrational. 


Proof. The proof will be similar to that of the special case p = 2. Suppose that $\frac{m}{n}$ 
is a fraction, written in lowest terms, satisfying $\left(\frac{m}{n}\right)^2 = p$. Then $m^2 = pn^2$. Since 
$m^2 = pn^2$, p divides $m^2$. Thus, p divides the product $m \cdot m$, from which it follows 
that p divides m (see Corollary 4.1.3). Therefore, there is an integer s such that 
m = ps, which gives $(ps)^2 = pn^2$. Dividing both sides of this equation by p gives 
$ps^2 = n^2$. Thus, p divides the product $n \cdot n$ and we conclude that p divides n. But 
this contradicts the fact that $\frac{m}{n}$ is written in lowest terms. Therefore, $\sqrt{p}$ cannot be 
rational. □

Of course, some natural numbers do have rational square roots. For example, 
$\sqrt{1} = 1$, $\sqrt{4} = 2$, and $\sqrt{289} = 17$. What about $\sqrt{6}$? More generally, is there a 
natural number m such that $\sqrt{m}$ is rational but $\sqrt{m}$ is not an integer? 


Theorem 8.2.8. If the square root of a natural number is rational, then the square 
root is a natural number. 


Proof. Assume that N is a natural number and that the square root of N is rational; 
that is, we can write $\sqrt{N} = \frac{a}{b}$ where the fraction $\frac{a}{b}$ is written in lowest terms. Then 
$N = \left(\frac{a}{b}\right)^2$ and so $a^2 = Nb^2$. If p is a prime number that divides b, then p divides 
$a^2$. Hence, p divides a by Corollary 4.1.3. But this is impossible, since a and b have 
no common factors. Thus, b is a natural number that is not divisible by any prime 
number. In other words, b = 1. Therefore, $\sqrt{N} = a$, a natural number. □
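
One practical consequence of Theorem 8.2.8 is that deciding whether $\sqrt{N}$ is rational reduces to an integer computation: it suffices to check whether N is the square of a natural number. A minimal sketch, assuming a hypothetical helper `rational_sqrt` and using `math.isqrt`, which returns the integer part of the square root.

```python
from math import isqrt


def rational_sqrt(n):
    """Return sqrt(n) if it is a natural number, otherwise None.

    By Theorem 8.2.8, None means that sqrt(n) is irrational.
    """
    r = isqrt(n)  # largest integer r with r*r <= n
    return r if r * r == n else None


print(rational_sqrt(289))  # 17
print(rational_sqrt(6))    # None
```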


A natural number that is the square of a natural number is said to be a perfect 
square. The canonical factorizations of perfect squares have a distinctive form. 


Theorem 8.2.9. A natural number other than 1 is a perfect square if and only if 
every prime number in its canonical factorization occurs to an even power. 


Proof. Let n be a natural number. If the canonical factorization of n (see 
Corollary 4.1.2) is $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$, then $n^2 = p_1^{2\alpha_1} p_2^{2\alpha_2} \cdots p_k^{2\alpha_k}$. The uniqueness 
of the factorization into primes implies that this expression is the canonical 
factorization of $n^2$. All the exponents are obviously even. This proves that the 
square of every natural number has the property that every exponent in its canonical 
factorization is even. The converse is even easier. For if $m = p_1^{2\alpha_1} p_2^{2\alpha_2} \cdots p_k^{2\alpha_k}$, 
then obviously $m = n^2$, where $n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$. □

We can use canonical factorizations to study other roots as well. 

Example 8.2.10. The number $\sqrt[3]{4}$ is irrational. 


Proof. If $\sqrt[3]{4} = \frac{m}{n}$ with m and n integers, then $4n^3 = m^3$. Write this equation in 
terms of the canonical factorizations of m and n, getting 

\[ 4\left(p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_r^{\alpha_r}\right)^3 = \left(q_1^{\beta_1} q_2^{\beta_2} \cdots q_s^{\beta_s}\right)^3 \]

So, 

\[ 2^2 p_1^{3\alpha_1} p_2^{3\alpha_2} \cdots p_r^{3\alpha_r} = q_1^{3\beta_1} q_2^{3\beta_2} \cdots q_s^{3\beta_s} \]

The prime 2 must occur to a power that is a multiple of 3, since every prime on the 
right-hand side of this equation occurs to such a power. On the other hand, 2 occurs 
on the left-hand side of the equation to a power that is two more than a multiple 
of 3. The uniqueness of the factorization into primes implies that no such equation 
is possible. □

Example 8.2.11. The number $\sqrt{3} + \sqrt{5}$ is irrational. 


Proof. Suppose that $\sqrt{3} + \sqrt{5} = r$, with r a rational number. Then $\sqrt{3} = r - \sqrt{5}$. 
Squaring both sides of this equation gives 

\[ 3 = \left(r - \sqrt{5}\right)^2 = r^2 - 2\sqrt{5}\,r + 5 \]

From this it follows that $2\sqrt{5}\,r = r^2 + 2$, or $\sqrt{5} = \frac{r^2 + 2}{2r}$. But r rational implies that 
$\frac{r^2 + 2}{2r}$ is rational, which contradicts the fact that $\sqrt{5}$ is irrational (Theorem 8.2.7). □


There are many situations in which it is important to solve various kinds of 
equations. In particular, polynomial equations arise quite frequently. 


Definition 8.2.12. A polynomial with integer coefficients is an expression of the 
form 

\[ a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \]

where n is a nonnegative integer and the $a_i$ are integers with $a_n$ different from 0 
(we also include “constant polynomials”; that is, polynomials where n = 0 and $a_0$ 
is any integer). The number $x_0$ is a root (or zero) of a polynomial if the value of the 
polynomial obtained by replacing x by $x_0$ is 0. 


Example 8.2.13. The polynomial $x^5 + x - 1$ has no rational roots. 


Proof. Suppose that $\frac{m}{n}$ is a rational root, where $\frac{m}{n}$ is written in lowest terms. Since 
$\frac{m}{n}$ is a root, substituting $\frac{m}{n}$ into the polynomial yields $\left(\frac{m}{n}\right)^5 + \frac{m}{n} - 1 = 0$. Multiplying 
both sides by $n^5$ gives $m^5 + mn^4 - n^5 = 0$, or $m(m^4 + n^4) = n^5$. It follows that 
every prime divisor of m is a divisor of $n^5$ and, hence, also of n. Since m and n are 
relatively prime, this implies that m has no prime divisors. Thus, m is either 1 or -1. 
Similarly, the above equation yields $m^5 = n(n^4 - mn^3)$ from which it follows that 
every prime divisor of n divides m. Thus, n does not have any prime divisors, so n 
is either 1 or -1. Therefore, the only possible values of $\frac{m}{n}$ are 1 or -1. That is, the 
only possible rational roots of the polynomial are 1 and -1. However, substituting 
1 and -1 for x does not yield 0. Therefore, neither 1 nor -1 is a root. Thus, the 
polynomial does not have any rational roots. □


There is a general theorem, whose proof is similar to the above example, that is 
often useful in determining whether or not polynomials have rational roots and may 
also be used to find such roots. 


The Rational Roots Theorem 8.2.14. If $\frac{m}{n}$ is a rational root of the polynomial 
$a_k x^k + a_{k-1} x^{k-1} + \cdots + a_1 x + a_0$, where the $a_j$ are integers and m and n are 
relatively prime, then m divides $a_0$ and n divides $a_k$. 

Proof. Assuming that $\frac{m}{n}$ is a root gives 

\[ a_k \left(\frac{m}{n}\right)^k + a_{k-1} \left(\frac{m}{n}\right)^{k-1} + \cdots + a_1 \left(\frac{m}{n}\right) + a_0 = 0 \]

Multiplying both sides of this equation by $n^k$ produces the equation 

\[ a_k m^k + a_{k-1} m^{k-1} n + \cdots + a_1 m n^{k-1} + a_0 n^k = 0 \]

It follows that 

\[ m\left(a_k m^{k-1} + a_{k-1} m^{k-2} n + \cdots + a_1 n^{k-1}\right) = -a_0 n^k \]

Since m and n are relatively prime, m and $n^k$ are also relatively prime. On the 
other hand, m divides $-a_0 n^k$. Thus, by Lemma 7.2.9, m divides $a_0$. Similarly, 

\[ a_k m^k = -\left(a_{k-1} m^{k-1} n + \cdots + a_1 m n^{k-1} + a_0 n^k\right) \]

so, 

\[ a_k m^k = -n\left(a_{k-1} m^{k-1} + \cdots + a_1 m n^{k-2} + a_0 n^{k-1}\right) \]

Since n is relatively prime to m and n divides $a_k m^k$, it follows (by Lemma 7.2.9) 
that n divides $a_k$. This proves the theorem. □


Example 8.2.15. Find all the rational roots of the polynomial $2x^3 - x^2 + x - 6$. 


Proof. By the Rational Roots Theorem (8.2.14), every rational root $\frac{m}{n}$ in lowest 
terms has the property that n divides 2 and m divides 6. Thus, the only 
possible values of n are 1, -1, 2, -2, and the only possible values of m are 
6, -6, 3, -3, 2, -2, 1, -1. The possible values of the quotient $\frac{m}{n}$ are therefore 
6, -6, 3, -3, 2, -2, $\frac{3}{2}$, $-\frac{3}{2}$, 1, -1, $\frac{1}{2}$, $-\frac{1}{2}$. We can determine which of these 
possible roots actually are roots by simply substituting them for x and seeing if the 
result is 0. In this example, the only rational root is $\frac{3}{2}$. □
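
The search described in this example is mechanical enough to sketch in code. The helper names `divisors` and `rational_roots` below are hypothetical, the coefficient list is assumed to run from the leading coefficient down to the constant term, and both of those coefficients are assumed to be nonzero. Exact arithmetic with `fractions.Fraction` avoids any rounding when testing whether a candidate is a root.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of a_k x^k + ... + a_1 x + a_0.

    coeffs = [a_k, ..., a_1, a_0], integers with a_k != 0 and a_0 != 0.
    Candidates are the quotients m/n with m dividing a_0 and n dividing a_k.
    """
    a_k, a_0 = coeffs[0], coeffs[-1]
    candidates = {
        Fraction(sign * m, n)
        for m in divisors(a_0)
        for n in divisors(a_k)
        for sign in (1, -1)
    }

    def value(x):
        # Evaluate the polynomial at x by Horner's rule, exactly.
        result = Fraction(0)
        for c in coeffs:
            result = result * x + c
        return result

    return sorted(x for x in candidates if value(x) == 0)


print(rational_roots([2, -1, 1, -6]))        # [Fraction(3, 2)]
print(rational_roots([1, 0, 0, 0, 1, -1]))   # [] for x^5 + x - 1
```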


The following is a question with an interesting answer: Do there exist two 
irrational numbers such that one of them to the power of the other is rational? That 
is, can $x^y$ be rational if x and y are both irrational? A natural case to consider is 
that of $\sqrt{2}^{\sqrt{2}}$. In fact, however, it is not at all easy to determine whether or not 
$\sqrt{2}^{\sqrt{2}}$ is rational. Nonetheless, this example can still be used to prove that the 
general question has an affirmative answer, as follows. Either $\sqrt{2}^{\sqrt{2}}$ is rational or 
it is irrational. If it is rational, it provides an example showing that the answer to the 
question is affirmative. If $\sqrt{2}^{\sqrt{2}}$ is irrational, let $x = \sqrt{2}^{\sqrt{2}}$ and $y = \sqrt{2}$. Then 
$x^y$ is an irrational number to an irrational power. But 

\[ x^y = \left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \sqrt{2}^{\sqrt{2} \cdot \sqrt{2}} = \left(\sqrt{2}\right)^2 = 2 \]

This gives an affirmative answer in this case as well. In other words, $\sqrt{2}^{\sqrt{2}}$ 
answers our original question, whether it itself is rational or irrational. In fact, 
$\sqrt{2}^{\sqrt{2}}$ is irrational, as follows from the Gelfond-Schneider Theorem, whose proof 
is way beyond the level of this book. 
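
The identity behind this argument can at least be illustrated numerically. The sketch below uses floating-point approximations, so it says nothing about rationality or irrationality; it only shows that raising an approximation of $\sqrt{2}^{\sqrt{2}}$ to the power $\sqrt{2}$ gives a value very close to 2. The function name `tower_value` is hypothetical.

```python
import math


def tower_value():
    """Approximate (sqrt(2) ** sqrt(2)) ** sqrt(2) with floating-point numbers."""
    root2 = math.sqrt(2)
    x = root2 ** root2   # approximation of sqrt(2)^sqrt(2)
    return x ** root2    # equals 2 exactly in real arithmetic


print(math.isclose(tower_value(), 2.0, rel_tol=1e-9))  # True
```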


8.3. Problems 


Basic Exercises 


1. Use the Rational Roots Theorem (8.2.14) to find all rational roots of each of the 
following polynomials (some may not have any rational roots at all): 
(a) $x^2 + 5x + 2$ 
(b) $2x^3 - 5x^2 + 14x - 35$ 
(c) x9 - x + 1 

2. Show that Py bet is irrational. 

3. Show that ne is irrational. 

4. Is the sum of an irrational number and a rational number always irrational? 

5. Is an irrational number to a rational power always irrational? 

6. Is the sum of two irrational numbers always irrational? 

7. Is ¥V49 + 1 irrational? 

8. If y is irrational and x is any rational number other than 0, show that xy is 
irrational. 


Interesting Problems 


9. Determine whether each of the following numbers is rational or irrational and 


prove that your answer is correct: 


2 v7 
ne ne! 
5 63 
(c) 4 - V28 (g) NEY 


10. Prove that FY 3 + 11 is irrational. 

Challenging Problems 

11. Prove that the following numbers are irrational: 
(a) J5+V7 d@) V¥34+V¥54+/7 
(b) ¥4+ V'10 @) V3-8 
(c) J54+ V3 

12. Suppose that a and b are odd natural numbers and $a^2 + b^2 = c^2$. Prove that c is 
irrational. 

13. Let k be a natural number. Prove that if the $k^{\text{th}}$ root of a natural number is a 
rational number, then the $k^{\text{th}}$ root is a natural number. 

14. Prove that if a and b are natural numbers and n is a natural number such that 
$n^{a/b}$ is rational, then $n^{a/b}$ is a natural number. 

15. (Very challenging.) In this problem, we outline the Dedekind cuts approach to 
constructing the real numbers. In this approach, real numbers are defined as 
certain kinds of sets of rational numbers. The definition is the following. A real 
number is a nonempty proper subset of the set of rational numbers that does not 
have a greatest element and has the property that if a rational number t is in the 
set, then so are all rational numbers less than t. (A “proper subset” is a subset 
which is not the whole set.) 

(a) Each rational number must, of course, also be a real number; i.e., 
representable as such a set of rational numbers. If r is a rational number, the 
representation of r as a real number is as the set of all rational numbers that 
are less than r. Prove that such a representation is a real number according 
to the definition given above. 

(b) If S and T are real numbers as defined above, then S + T is defined to be 
the set of all s + t with s in S and t in T. Prove that S + T is a real number 
(i.e., has the above properties). 

(c) Prove that addition of real numbers as defined above is commutative; that 
is, S + T = T + S for all real numbers S and T. 

(d) Prove that addition of real numbers as defined above is associative; that is, 
(S1 + S2) + S3 = S1 + (S2 + S3), for all real numbers S1, S2, and S3. 

(e) If S is a real number, define -S to be the set of all rational numbers t such 
that -t is not in S and -t is not the smallest rational number that is not in 
S. Prove that -S is a real number whenever S is a real number. 

(f) Let O denote the real number corresponding to the rational number 0 (i.e., 
the set of all x in Q such that x is less than 0). Prove that S + O = S, for 
every real number S. 

(g) Prove that S + (-S) = O, for every real number S. 

(h) We say that the real number S is positive if S contains a rational number 
that is greater than 0. If S and T are positive real numbers, then the product 
ST is defined to be the union of the set of all rational numbers that are less 
than or equal to 0 together with the set of all rational numbers of the form 
st, where s is a positive number in S and t is a positive number in T. Prove 
that the product of two positive real numbers is a real number. 

(i) If S is a real number, define |S| to be S if S is a positive real number and 
-S otherwise. Say that a real number is negative if it is not positive and is 
not O. Prove that |S| is positive for all S not equal to O. 

(j) If S and T are real numbers, define the product ST to be -(|S||T|) if one is 
negative and the other is positive, to be |S||T| if both are negative, and to be 
O if either is O. Prove that multiplication of real numbers is commutative; 
that is, ST = TS, for all real numbers S and T. 

(k) Let Z denote the real number corresponding to the rational number 1. Prove 
that the product of Z and S is S, for every real number S. 

(l) For a positive real number S, define $\frac{1}{S}$ to be the union of the set of rational 
numbers that are less than or equal to 0 and the set of rational numbers t 
such that $\frac{1}{t}$ is not in S and $\frac{1}{t}$ is not the smallest rational number not in S. 
Prove that $\frac{1}{S}$ is a real number whenever S is a positive real number. 

(m) For S a negative real number, define $\frac{1}{S}$ to be $-\frac{1}{|S|}$. For S any real number 
other than O, prove that the product of S and $\frac{1}{S}$ is Z. 

(n) Prove that multiplication of real numbers as defined above is associative; 
that is, (S1 S2)S3 = S1(S2 S3), for all real numbers S1, S2, and S3. 

(o) Prove that multiplication of real numbers is distributive over addition; that 
is, S1(S2 + S3) = S1 S2 + S1 S3, for all real numbers S1, S2, and S3. 

(p) (Existence of the square root of 2.) Let U denote the union of the set of 
negative rational numbers and the set of all rational numbers x such that $x^2$ 
is less than 2. Prove that U is a real number and that the product UU is the 
real number corresponding to the rational number 2. 


A very nice and complete exposition of the Dedekind cuts construction of the 
real numbers can be found in “Calculus” by Michael Spivak (Publish or Perish, 
Inc., Houston, Texas), which also contains a beautiful treatment of the principles 
of calculus. 


Chapter 9
The Complex Numbers


The set of real numbers is rich enough to be useful in a wide variety of situations. 
In particular, it provides a number for every distance. There are, however, some 
situations where additional numbers are required. 


9.1 What is a Complex Number? 


Let’s consider the problem of finding roots for polynomial equations. Recall that
polynomials are expressions such as $7x^2 + 5x - 3$, and $\sqrt{2}x^3 + 3x$, and $x^7 - 1$. The
general definition is the following.


Definition 9.1.1. A polynomial is an expression of the form

$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$$

where n is a natural number and the $a_i$ are numbers with $a_n \neq 0$. We also allow
constant polynomials; i.e., expressions that are just a single number $a_0$. The $a_i$ are
called the coefficients of the polynomial. The natural number n, the highest power to
which x occurs in the polynomial, is called the degree of the polynomial. A constant
polynomial is said to have degree 0.


Note that in the definition of polynomial we used x as the variable; this is very 
standard. However, it is often the case that other variables are used as well. For 
example, $z^2 - 4z + 3$ would be a polynomial in the variable z.

A polynomial defines a function; whenever a specific number is substituted for x,
the resulting expression is a number. The values of x for which the polynomial is 0
have special significance.

Definition 9.1.2. A root or zero of the polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$
is a number that when substituted for x makes

$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0$$

For example, 2 is a root of the polynomial $x^2 - 4$, 3 is a root of the polynomial
$5x^2 - 2x - 39$, $-\frac{7}{5}$ is a root of the polynomial 5x + 7, and so on.

A very natural question is: Which polynomials have roots? All polynomials
of degree 1 have roots: the polynomial $a_1 x + a_0$ has the root $-\frac{a_0}{a_1}$. What about
polynomials of degree 2? A simple example is the polynomial $x^2 + 1$. No real
number is a root of that polynomial, since $x^2$ is nonnegative for every real number
x, and therefore $x^2 + 1$ is strictly greater than 0 for every real number x. If the
polynomial $x^2 + 1$ is to have a root, it would have to be in a larger number system
than that of the real numbers. Such a system was invented by mathematicians
hundreds of years ago.

We use the symbol i to denote a root of the polynomial $x^2 + 1$. That is, we define
$i^2$ to be equal to −1. We then combine this symbol i with real numbers, using
standard manipulations of algebra in the usual ways, to get the “complex numbers.”
The definition is the following. 


Definition 9.1.3. A complex number is an expression of the form a + bi where a
and b are real numbers. The real number a is called the real part of a + bi and the
real number b is called the imaginary part of a + bi. We sometimes use the notation
Re(z) and Im(z) to denote the real and imaginary parts of the complex number z,
respectively. Addition of complex numbers is defined by

$$(a + bi) + (c + di) = (a + c) + (b + d)i$$

Multiplication of complex numbers is defined by

$$\begin{aligned}
(a + bi)(c + di) &= ac + adi + bic + bdi^2 \\
&= ac + bdi^2 + (ad + bc)i \\
&= (ac - bd) + (ad + bc)i
\end{aligned}$$

where we replaced $i^2$ by −1 to get the last equation.
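The two rules in Definition 9.1.3 can be tried out mechanically. The following is a minimal sketch, assuming we represent the complex number a + bi by the pair (a, b); the function names `add` and `mul` are our own and are used only for illustration.

```python
def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i"""
    a, b = z
    c, d = w
    return (a + c, b + d)


def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i"""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


i = (0, 1)                   # the pair that stands for 0 + 1i
print(mul(i, i))             # (-1, 0), that is, i^2 = -1
print(add((6, 2), (-4, 5)))  # (2, 7)
print(mul((7, 2), (3, -4)))  # (29, -22)
```

Working with integer pairs keeps the arithmetic exact, so the printed pairs can be compared directly with hand computations.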


Example 9.1.4. 
$(6 + 2i) + (-4 + 5i) = 2 + 7i$


(-Vi2 + Voi) + 442i) = (-Vi2 +4) +(vo+z)i 


$(7 + 2i)(3 - 4i) = 21 + 6i - 28i - 8i^2 = 21 - 22i - 8(-1) = 21 + 8 - 22i = 29 - 22i$


Notation 9.1.5. The set of all complex numbers is denoted by $\mathbb{C}$.


We use the symbol 0 as an abbreviation for the complex number 0 + 0i. More
generally, we use a as an abbreviation for the complex number a + 0i. Thus, every
real number is also a complex number. Similarly, we use bi as an abbreviation for
the complex number 0 + bi. When r is a real number, then r(a + bi) is simply
ra + rbi.

Note that every complex number has an additive inverse (i.e., a complex
number that gives 0 when added to the given number). For example, the additive
inverse of $-7 + \sqrt{2}\,i$ is $7 - \sqrt{2}\,i$. In general, the additive inverse of a + bi is
−a + (−b)i.


Definition 9.1.6. The number a − bi is called the complex conjugate of the number
a + bi. The complex conjugate of a complex number is often denoted by placing a
horizontal bar over the complex number:

$$\overline{a + bi} = a - bi$$

Example 9.1.7. The complex conjugate of 2 + 3i is 2 − 3i, or $\overline{2 + 3i} = 2 - 3i$.
Similarly, $\overline{-\sqrt{3} - 5i} = -\sqrt{3} + 5i$, and $\overline{9} = 9$.

The product of a complex number and its conjugate is important.

Theorem 9.1.8. For any complex number a + bi, $(a + bi)(a - bi) = a^2 + b^2$.

Proof. Simply multiplying gives the result. □


Definition 9.1.9. The modulus of the complex number a + bi is $\sqrt{a^2 + b^2}$; it is
often denoted |a + bi|.


Thus, $(a + bi)\overline{(a + bi)} = |a + bi|^2$.

Do complex numbers have multiplicative inverses? That is, given a+ bi, is there 
a complex number c + di such that (a + bi)(c + di) = 1? Of course, the complex 
number 0 cannot have a multiplicative inverse, since its product with any complex 
number is 0. What about other complex numbers? 

Given a complex number a + bi, let’s try to find a multiplicative inverse c + di
for it. Suppose that (a + bi)(c + di) = 1. Multiplying both sides of this equation by
$\overline{a + bi}$ and using the fact that $(a + bi)\overline{(a + bi)} = a^2 + b^2$ yields
$(a^2 + b^2)(c + di) = a - bi$. Since $a^2 + b^2$ is a real number, this implies
(unless $a^2 + b^2 = 0$) that $c + di = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}i$. (Note that if
$a^2 + b^2 = 0$, then a = 0 and b = 0, so the number a + bi is 0.) Therefore, if a + bi
has a multiplicative inverse, that multiplicative inverse must be
$\frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}i$. In fact, as we now show, that expression is a multiplicative
inverse for a + bi.


Theorem 9.1.10. If a + bi ≠ 0, then $\frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}i$ is a multiplicative inverse for
a + bi.

Proof. We verify this by simply multiplying

$$(a + bi)\left(\frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}i\right) = \frac{a^2}{a^2 + b^2} + \frac{b^2}{a^2 + b^2} + \frac{ab}{a^2 + b^2}i - \frac{ab}{a^2 + b^2}i$$

which simplifies to $\frac{a^2 + b^2}{a^2 + b^2} = 1$. □


As with real numbers, the multiplicative inverse of the complex number a + bi
is often denoted $\frac{1}{a + bi}$.
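The formula for the multiplicative inverse is easy to test numerically. The sketch below assumes Python's built-in `complex` type as a stand-in for the complex numbers; since it uses floating-point arithmetic, products are only expected to agree with 1 up to a small rounding error. The function name `inverse` is our own.

```python
def inverse(z):
    """Return a/(a^2 + b^2) - (b/(a^2 + b^2))i for z = a + bi, z != 0."""
    a, b = z.real, z.imag
    n = a * a + b * b            # a^2 + b^2, the square of the modulus
    if n == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return complex(a / n, -b / n)


z = complex(3, 4)
w = inverse(z)
print(w)        # approximately 0.12 - 0.16i
print(z * w)    # approximately 1
```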


9.2 The Complex Plane 


It is very useful to represent complex numbers in a coordinatized plane. We let the 
complex number a+ bi correspond to the point (a, b) in the ordinary xy-plane. Note 
that the modulus |a + bi| is the distance from (a, b) to the origin. We will also use 
the angle that the line from the origin to (a, b) makes with the positive x-axis. 

In day to day life, angles are usually measured in degrees: a right angle is 90°,
a straight angle is 180°, and an angle of 37° is $\frac{37}{180}$ of a straight angle. For doing
mathematics, however, it is often more convenient to measure angles differently.


Definition 9.2.1. The radian measure of the angle $\theta$ is the length of the arc of
a circle of radius 1 that is cut off by an angle $\theta$ at the center of the circle. (See
Figure 9.1.)


Fig. 9.1 The radian measure of an angle


Thus, since a circle of radius 1 has circumference $2\pi$, the radian measure of a
right angle is $\frac{\pi}{2}$, of a straight angle is $\pi$, of an angle of 60° is $\frac{\pi}{3}$, and so on. Note
that $2\pi$ is a full revolution. We will use the radian measure of angles for the rest of
this chapter.

Definition 9.2.2. For a complex number a + bi other than 0, the argument of a + bi
is the angle from the positive x-axis in a counterclockwise direction to the line from
(0, 0) to (a, b). For any integer k, the angle $\theta + 2\pi k$ measured from the positive
x-axis ends up at the same position as $\theta$. Hence, if $\theta$ is an argument of a given
complex number, and k is any integer, then $\theta + 2\pi k$ is also an argument of the
complex number. We define the argument of 0 to be 0.


We require the basic properties of the trigonometric functions sine and cosine. 

If the complex number a + bi has modulus r and argument $\theta$, then $a = r\cos\theta$
and $b = r\sin\theta$. To see this, first consider the case where both a and b are greater
than 0, which is equivalent to saying that $0 < \theta < \frac{\pi}{2}$. Then the situation is as in
Figure 9.2. The fact that the cosine of an angle in a right triangle is the length of its
adjacent side divided by the length of its hypotenuse gives $\cos\theta = \frac{a}{r}$, or $a = r\cos\theta$.
Similarly, the fact that the sine of $\theta$ is the length of the opposite side divided by the
length of the hypotenuse gives $\sin\theta = \frac{b}{r}$, or $b = r\sin\theta$.


Fig. 9.2 Representation of a complex number with a > 0 and b > 0


The general case, for any real numbers a and b, requires the general definition
of sine and cosine for all angles, not just for those that are less than a right angle.
The circle with center at (0, 0) and radius 1 is called the unit circle. If $\theta$ is an angle
measured in a counterclockwise direction from the positive x-axis to the line from
(0, 0) to the point (x, y) on the unit circle, then $\cos\theta$ is defined to be x and $\sin\theta$ is
defined to be y. For example, when $\theta$ is 0, the corresponding point on the unit circle
is (1, 0). Thus, $\cos(0) = 1$ and $\sin(0) = 0$. When $\theta$ equals $\frac{\pi}{2}$ the corresponding
point on the unit circle is (0, 1), so $\cos\left(\frac{\pi}{2}\right) = 0$ and $\sin\left(\frac{\pi}{2}\right) = 1$. One can similarly
check that $\cos(\pi) = -1$, $\sin(\pi) = 0$, $\cos\left(\frac{3\pi}{2}\right) = 0$ and $\sin\left(\frac{3\pi}{2}\right) = -1$.

It is easy to see that, if a and b are both greater than 0, the generalized definitions 
of sine and cosine coincide with the original ones. 

If a + bi is a complex number other than 0, then its modulus, $r = \sqrt{a^2 + b^2}$, is
not 0. The point $\left(\frac{a}{r}, \frac{b}{r}\right)$ lies on the unit circle. The line from (0, 0) to (a, b) is the
same line as the line from (0, 0) to $\left(\frac{a}{r}, \frac{b}{r}\right)$ (since they both have the same slopes and
they both pass through the origin). Let $\theta$ be the argument of a + bi. Since $\theta$ is also the
angle from the positive x-axis to the line from (0, 0) to $\left(\frac{a}{r}, \frac{b}{r}\right)$, the definitions yield
$\cos\theta = \frac{a}{r}$ and $\sin\theta = \frac{b}{r}$. Thus, $r\cos\theta = a$ and $r\sin\theta = b$, from which it follows
that $a + bi = r(\cos\theta + i\sin\theta)$. (The only complex number whose modulus is 0 is
the number 0, and 0 is the only complex number whose argument is not defined.)


Definition 9.2.3. The polar form of the complex number with modulus r and
argument $\theta$ is $r(\cos\theta + i\sin\theta)$.
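Passing between a + bi and its polar form can be illustrated with a short computation. This is only a sketch under stated assumptions: the helper names `to_polar` and `from_polar` are ours, the arithmetic is floating point, and `math.atan2` returns the particular argument lying between $-\pi$ and $\pi$ (any other argument differs from it by a multiple of $2\pi$).

```python
import math


def to_polar(a, b):
    """Return (r, theta) with a = r cos(theta) and b = r sin(theta)."""
    r = math.hypot(a, b)        # sqrt(a^2 + b^2), the modulus
    theta = math.atan2(b, a)    # one choice of argument, in (-pi, pi]
    return r, theta


def from_polar(r, theta):
    """Return (a, b) for the complex number r(cos(theta) + i sin(theta))."""
    return (r * math.cos(theta), r * math.sin(theta))


r, theta = to_polar(1, 1)
print(r, theta)                 # about sqrt(2) and pi/4
print(from_polar(r, theta))     # about (1, 1)
```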


One reason that the polar form is important is because there is a very nice 
description of multiplication of complex numbers in terms of their moduli and 
arguments. 


Theorem 9.2.4. The modulus of the product of two complex numbers is the product 
of their moduli. The argument of the product of two complex numbers is the sum of 
their arguments. 


Proof. Simply multiplying the two complex numbers $r_1(\cos\theta_1 + i\sin\theta_1)$ and
$r_2(\cos\theta_2 + i\sin\theta_2)$ and collecting terms yields

$$r_1 r_2\big((\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i(\cos\theta_1\sin\theta_2 + \sin\theta_1\cos\theta_2)\big)$$

We require the addition formulae for cosine and sine:

$$\cos(\theta_1 + \theta_2) = \cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2$$

and

$$\sin(\theta_1 + \theta_2) = \sin\theta_1\cos\theta_2 + \sin\theta_2\cos\theta_1$$

Using these addition formulae in the above equation shows that the product is
equal to

$$r_1 r_2\big(\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)\big)$$

This proves the theorem. □
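Theorem 9.2.4 can be checked numerically for particular numbers. The sketch below is an illustration rather than a proof; it assumes Python's `cmath.phase`, which returns one argument of a complex number, and compares arguments only up to multiples of $2\pi$. The name `product_gap` is our own.

```python
import cmath
import math


def product_gap(z, w):
    """Return how far zw is from 'multiply moduli, add arguments'."""
    modulus_gap = abs(z * w) - abs(z) * abs(w)
    # Arguments are only determined up to multiples of 2*pi, so reduce
    # the difference to the interval [-pi, pi] before comparing with 0.
    argument_gap = math.remainder(
        cmath.phase(z * w) - cmath.phase(z) - cmath.phase(w), 2 * math.pi
    )
    return modulus_gap, argument_gap


z = complex(1, 1)                  # modulus sqrt(2), argument pi/4
w = complex(-math.sqrt(3), 1)      # modulus 2, argument 5*pi/6
print(product_gap(z, w))           # both entries should be close to 0
```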


Thus, to multiply two complex numbers, we can simply multiply their moduli 
and add their arguments. In particular, the case where the two complex numbers 
are equal to each other shows that the square of a complex number is obtained by 
squaring its modulus and doubling its argument. One application of this fact is the 
following.

Theorem 9.2.5. Every complex number has a complex square root.


Proof. To show that any given complex number has a square root, write the given
number in polar form, say $z = r(\cos\theta + i\sin\theta)$. Let w equal $\sqrt{r}\left(\cos\frac{\theta}{2} + i\sin\frac{\theta}{2}\right)$.
By the previous theorem (9.2.4), $w^2 = z$. □


It is also easy to compute powers higher than 2. 


De Moivre’s Theorem 9.2.6. For every natural number n,

$$(r(\cos\theta + i\sin\theta))^n = r^n(\cos n\theta + i\sin n\theta)$$

Proof. This is easily established by induction on n. The case n = 1 is clear. Suppose
that the formula holds for n = k; that is, suppose

$$(r(\cos\theta + i\sin\theta))^k = r^k(\cos k\theta + i\sin k\theta)$$

Multiplying both sides of this equation by $r(\cos\theta + i\sin\theta)$ and using Theorem 9.2.4
gives

$$\begin{aligned}
(r(\cos\theta + i\sin\theta))^{k+1} &= r^k(\cos k\theta + i\sin k\theta)\cdot r(\cos\theta + i\sin\theta) \\
&= r\cdot r^k(\cos(k\theta + \theta) + i\sin(k\theta + \theta)) \\
&= r^{k+1}(\cos((k+1)\theta) + i\sin((k+1)\theta))
\end{aligned}$$

This is the formula for n = k + 1, so the theorem is established by mathematical
induction. □
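As a sanity check on De Moivre's Theorem, we can compare the right-hand side of the formula with the result of multiplying the number by itself n times. This is a floating-point sketch, so the two values are only expected to agree approximately; `cmath.rect(r, theta)` builds $r(\cos\theta + i\sin\theta)$, and the function names are our own.

```python
import cmath


def de_moivre(r, theta, n):
    """Right-hand side of the theorem: r^n (cos(n theta) + i sin(n theta))."""
    return cmath.rect(r ** n, n * theta)


def repeated_product(r, theta, n):
    """Left-hand side, computed by n successive multiplications."""
    z = cmath.rect(r, theta)
    result = complex(1, 0)
    for _ in range(n):
        result *= z
    return result


print(de_moivre(1.1, 0.7, 9))
print(repeated_product(1.1, 0.7, 9))   # should agree to many decimal places
```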


De Moivre’s Theorem leads to some very nice computations, such as the 
following. 


Example 9.2.7. We can compute $(1 + i)^8$ as follows. First, $|1 + i| = \sqrt{2}$. Plotting
1 + i as the point (1, 1) in the plane makes it apparent that the argument of 1 + i is $\frac{\pi}{4}$.
Thus, by De Moivre’s Theorem (9.2.6), the modulus of $(1 + i)^8$ is $(\sqrt{2})^8 = 2^4 = 16$
and the argument is $8 \cdot \frac{\pi}{4} = 2\pi$. It follows that

$$(1 + i)^8 = 16(\cos 2\pi + i\sin 2\pi) = 16$$

Therefore, $(1 + i)^8 = 16$.
The following is a very similar computation. 
Example 9.2.8.

$$(1 + i)^{100} = \left(\sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right)\right)^{100} = 2^{50}(\cos 25\pi + i\sin 25\pi)$$

Since the angle with the positive x-axis of $25\pi$ radians is in the same position as the
angle of $\pi$ radians, it follows that

$$(1 + i)^{100} = 2^{50}(\cos\pi + i\sin\pi) = 2^{50}(-1 + 0) = -2^{50}$$
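Both of these powers can be confirmed by brute force, without any trigonometry. The sketch below assumes we store a + bi as a pair of integers (a, b), so that every multiplication is exact; the helper names `mul` and `power` are our own.

```python
def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, on integer pairs."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def power(z, n):
    """Multiply z by itself n times, starting from 1 = (1, 0)."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, z)
    return result


print(power((1, 1), 8))                        # (16, 0)
print(power((1, 1), 100) == (-(2 ** 50), 0))   # True
```

The direct computation takes 100 multiplications, whereas De Moivre's Theorem gives the answer at once from the modulus and the argument.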


It is useful to have expressions for the cosines and sines of certain angles. 


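Example 9.2.9. Some particular values of cosine and sine are the following:

$$\cos\frac{\pi}{4} = \frac{\sqrt{2}}{2},\quad \sin\frac{\pi}{4} = \frac{\sqrt{2}}{2},\quad \cos\frac{\pi}{6} = \frac{\sqrt{3}}{2},\quad \sin\frac{\pi}{6} = \frac{1}{2},\quad \cos\frac{\pi}{3} = \frac{1}{2},\quad \text{and}\quad \sin\frac{\pi}{3} = \frac{\sqrt{3}}{2}$$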


Proof. To determine the trigonometric functions of $\frac{\pi}{4}$, simply note that a right
triangle that has an angle of $\frac{\pi}{4}$ is isosceles. If such a right triangle has legs of length
1, the Pythagorean Theorem implies that the hypotenuse has length $\sqrt{2}$. Thus, $\cos\frac{\pi}{4}$
and $\sin\frac{\pi}{4}$ are both $\frac{1}{\sqrt{2}} = \frac{\sqrt{2}}{2}$.

For the remaining angles, begin with an equilateral triangle whose sides all have
length 1. Then bisect one of the angles, as shown in Figure 9.3. This divides the
equilateral triangle into two right triangles. Since the original triangle is equilateral,
and the angles of a triangle sum to $\pi$, each angle of the original triangle is $\frac{\pi}{3}$. Thus,
the smaller angles in each of the right triangles are each $\frac{\pi}{6}$. By the Pythagorean
Theorem, the bisector of the angle has length $\frac{\sqrt{3}}{2}$. The values of cosine and sine
for $\frac{\pi}{6}$ and $\frac{\pi}{3}$ are then immediate from the definitions, using either of the right
triangles. □


Fig. 9.3 Calculating the cosine and sine of $\frac{\pi}{3}$ and $\frac{\pi}{6}$


Using these values of cosine and sine, it is easy to calculate some related values. 
The following example is typical. 


Example 9.2.10. We can compute $\cos\frac{2\pi}{3}$ and $\sin\frac{2\pi}{3}$ as follows. Place the angle so
that one side is on the positive x-axis and the angle is measured counterclockwise
from there. Let (x, y) be the corresponding point on the unit circle. Then
$x = \cos\frac{2\pi}{3}$ and $y = \sin\frac{2\pi}{3}$. We use the right triangle pictured in Figure 9.4. Since
$\cos\frac{\pi}{3} = \frac{1}{2}$, it follows that $x = -\frac{1}{2}$. Since $\sin\frac{\pi}{3} = \frac{\sqrt{3}}{2}$, it follows that
$y = \frac{\sqrt{3}}{2}$.


Fig. 9.4 Determining the cosine and sine of $\frac{2\pi}{3}$


It is interesting to compute the roots of the complex number 1. The number 1 is
sometimes called unity.


Example 9.2.11 (Square Roots of Unity). Obviously, $1^2 = 1$ and $(-1)^2 = 1$. Are
there any other complex square roots of 1?

To compute the square roots of 1 we can proceed as follows. Let $z = r(\cos\theta + i\sin\theta)$.
By De Moivre’s Theorem (9.2.6), $z^2 = r^2(\cos 2\theta + i\sin 2\theta)$. If $z^2 = 1$,
then $r^2$ must be the modulus of 1; i.e., $r^2 = 1$. Since r is nonnegative, it follows
that r = 1. Also, $\cos 2\theta + i\sin 2\theta = 1$. Therefore, $\cos 2\theta = 1$ and $\sin 2\theta = 0$.
What are the possible values of $\theta$? Clearly, $\theta = 0$ is one solution, as is $\theta = \pi$; the
corresponding values of z are $z = \cos 0 + i\sin 0 = 1$ and $z = \cos\pi + i\sin\pi = -1$.

Are there any other possible values of $\theta$? Of course there are: for example, $\theta$
could be $2\pi$ or $3\pi$ or $4\pi$ or $5\pi$. If $\theta$ is any multiple of $\pi$, then $\cos 2\theta = 1$ and
$\sin 2\theta = 0$. However, we do not get any new values of z by using those other values
of $\theta$. We only get z = 1 or z = −1 depending upon whether we have an even or
an odd multiple of $\pi$. It is easily seen that only the multiples of $\pi$ simultaneously
satisfy the equations $\cos 2\theta = 1$ and $\sin 2\theta = 0$. (This follows from the fact that
$\cos\theta = 1$ only when $\theta$ is a multiple of $2\pi$, so $\cos 2\theta = 1$ only when $\theta$ is a multiple
of $\pi$.) Thus, the only complex square roots of 1 are 1 and −1. □


Cube roots of unity are more interesting. The only real number z that satisfies
$z^3 = 1$ is z = 1. However, there are other complex numbers satisfying this equation.


Example 9.2.12 (Cube Roots of Unity). Suppose that $z = r(\cos\theta + i\sin\theta)$ and
$z^3 = 1$. Then clearly r = 1. By De Moivre’s Theorem (9.2.6), $z^3 = \cos 3\theta + i\sin 3\theta$.
From $z^3 = 1$ we get $\cos 3\theta = 1$ and $\sin 3\theta = 0$. These equations are, of course,
satisfied by $\theta = 0$, which gives $z = \cos 0 + i\sin 0 = 1$, the obvious cube root of 1.

But also $\cos 3\theta = 1$ and $\sin 3\theta = 0$ when $3\theta = 2\pi$. That is, when $\theta = \frac{2\pi}{3}$. Thus,
$z = \cos\frac{2\pi}{3} + i\sin\frac{2\pi}{3} = -\frac{1}{2} + \frac{\sqrt{3}}{2}i$ is another cube root of 1.

There is another cube root of 1. If $3\theta = 4\pi$, then $\cos 3\theta = 1$ and $\sin 3\theta = 0$.
Thus, $z = \cos\frac{4\pi}{3} + i\sin\frac{4\pi}{3} = -\frac{1}{2} - \frac{\sqrt{3}}{2}i$ is another cube root of 1. Therefore, we
have found three cube roots of 1: 1, $-\frac{1}{2} + \frac{\sqrt{3}}{2}i$, and $-\frac{1}{2} - \frac{\sqrt{3}}{2}i$.

Are there any other cube roots of 1? If $3\theta = 6\pi$, then $\cos 3\theta = 1$ and $\sin 3\theta = 0$.
When $3\theta = 6\pi$, $\theta = 2\pi$. Thus, $\cos\theta + i\sin\theta$ is simply 1, so we are not getting
an additional cube root. More generally, for every integer k, $\cos 2k\pi = 1$ and
$\sin 2k\pi = 0$. However, if $3\theta = 2k\pi$, then there are only the three different values
given above for $\cos\theta + i\sin\theta$, since all the values of $\theta$ obtained from other values
of k differ from one of 0, $\frac{2\pi}{3}$, and $\frac{4\pi}{3}$ by a multiple of $2\pi$. □


It is interesting to plot the three cube roots of unity in the plane. The three cube
roots of unity are obtained by starting at the point 1 on the circle of radius 1 and
then moving in a counterclockwise direction $\frac{2\pi}{3}$ to get the next cube root and then
moving an additional $\frac{2\pi}{3}$ to get the third cube root (Figure 9.5).

Similarly, for each natural number n, the complex $n^{\text{th}}$ roots of 1 can be obtained
by starting at 1 and successively moving around the unit circle in a counterclockwise
direction through angles of $\frac{2\pi}{n}$.


Fig. 9.5 The cube roots of 1


Example 9.2.13 ($n^{\text{th}}$ Roots of Unity). For each natural number n, the complex $n^{\text{th}}$
roots of 1 are the numbers 1, $\cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$, $\cos\frac{4\pi}{n} + i\sin\frac{4\pi}{n}$, $\cos\frac{6\pi}{n} + i\sin\frac{6\pi}{n}$,
$\cos\frac{8\pi}{n} + i\sin\frac{8\pi}{n}$, ..., $\cos\frac{2(n-1)\pi}{n} + i\sin\frac{2(n-1)\pi}{n}$.


Proof. To see this, first note that, for any natural number k,

$$\left(\cos\frac{2\pi k}{n} + i\sin\frac{2\pi k}{n}\right)^n = \cos 2\pi k + i\sin 2\pi k$$

by De Moivre’s Theorem (9.2.6). Since $\cos 2\pi k + i\sin 2\pi k = 1$, this shows that
each of $\cos\frac{2\pi k}{n} + i\sin\frac{2\pi k}{n}$ is an $n^{\text{th}}$ root of unity.

To show that these are the only $n^{\text{th}}$ roots of unity, we proceed as follows. Suppose
that $z = \cos\theta + i\sin\theta$ and $z^n = 1$. Then $\cos n\theta + i\sin n\theta = 1$, so $\cos n\theta = 1$ and
$\sin n\theta = 0$. Thus, $n\theta = 2\pi k$ for some integer k. It follows that $\theta = \frac{2\pi k}{n}$. Taking
k = 0, 1, ..., n − 1 gives the $n^{\text{th}}$ roots that we have listed. Taking other values of k
gives different values for $\frac{2\pi k}{n}$, but each of them differs from one of the listed values
by a multiple of $2\pi$ and therefore gives a value for $\cos\theta + i\sin\theta$ that we already
have. Thus, the n roots that we listed are all of the $n^{\text{th}}$ roots of unity. □
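The list of $n^{\text{th}}$ roots of unity is easy to generate and test by machine. The following is a floating-point sketch, so the $n^{\text{th}}$ powers come out as 1 only up to rounding error; the function name `roots_of_unity` is our own, and `cmath.rect(1.0, theta)` computes $\cos\theta + i\sin\theta$.

```python
import cmath
import math


def roots_of_unity(n):
    """Return cos(2*pi*k/n) + i sin(2*pi*k/n) for k = 0, 1, ..., n - 1."""
    return [cmath.rect(1.0, 2 * math.pi * k / n) for k in range(n)]


for z in roots_of_unity(3):
    # Each cube root, followed by its cube (which should be close to 1).
    print(z, z ** 3)
```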


Roots of other complex numbers can also be computed. 


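Example 9.2.14. All of the solutions of the equation $z^3 = 1 + i$ can be found as
follows.

First note that $|1 + i| = \sqrt{2}$ and the argument of 1 + i is $\frac{\pi}{4}$. That is,
$1 + i = \sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right)$. Suppose that $z = r(\cos\theta + i\sin\theta)$ and $z^3 = 1 + i$. Then
$z^3 = r^3(\cos 3\theta + i\sin 3\theta)$. Therefore $r^3 = \sqrt{2}$, so $r = 2^{1/6}$. Clearly $3\theta$ could be
$\frac{\pi}{4}$, in which case $\theta$ is $\frac{\pi}{12}$. But also, $3\theta$ could be $\frac{\pi}{4}$ plus any integer multiple of
$2\pi$. In particular, $3\theta = \frac{\pi}{4} + 2\pi$ yields $\theta = \frac{3\pi}{4}$ and $3\theta = \frac{\pi}{4} + 4\pi$ yields
$\theta = \frac{17\pi}{12}$. This gives three solutions of the equation $z^3 = 1 + i$:
$2^{1/6}\left(\cos\frac{\pi}{12} + i\sin\frac{\pi}{12}\right)$, $2^{1/6}\left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right)$, and $2^{1/6}\left(\cos\frac{17\pi}{12} + i\sin\frac{17\pi}{12}\right)$.

There are two different ways of seeing that these three are the only solutions of
the equation. One way is to verify that $3\theta = \frac{\pi}{4} + 2\pi k$ for any integer k implies that
$\theta$ differs from one of $\frac{\pi}{12}$, $\frac{3\pi}{4}$, and $\frac{17\pi}{12}$ by an integer multiple of $2\pi$. Alternately,
this follows from the fact that a cubic polynomial has at most three roots (see
Theorem 9.3.8 below). □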
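The method of this example works for the $n^{\text{th}}$ roots of any nonzero complex number: take the $n^{\text{th}}$ root of the modulus, and divide each of the arguments $\theta + 2\pi k$ by n. The sketch below assumes floating-point arithmetic and uses `cmath.polar` to obtain a modulus and an argument; the function name `nth_roots` is our own.

```python
import cmath
import math


def nth_roots(w, n):
    """Return the n complex solutions z of z^n = w, assuming w != 0."""
    r, theta = cmath.polar(w)      # w = r(cos(theta) + i sin(theta))
    root_r = r ** (1.0 / n)        # the positive real nth root of the modulus
    return [
        cmath.rect(root_r, (theta + 2 * math.pi * k) / n) for k in range(n)
    ]


for z in nth_roots(complex(1, 1), 3):
    # Each solution, followed by its cube (which should be close to 1 + i).
    print(z, z ** 3)
```

For w = 1 + i and n = 3, the three arguments produced are $\frac{\pi}{12}$, $\frac{3\pi}{4}$, and $\frac{17\pi}{12}$, matching the solutions found above.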


9.3 The Fundamental Theorem of Algebra


One reason for introducing complex numbers was to provide a root for the
polynomial $x^2 + 1$. There are many other polynomials that do not have any real
roots. For example, if p(x) is any polynomial, then the polynomial obtained by
writing out $(p(x))^2 + 1$ has no real roots, since its value is at least 1 for every value
of x.

Does every such polynomial have a complex root? More generally, does every 
polynomial have a complex root? There is a trivial sense in which the answer to 
this question is “no,” since constant polynomials other than 0 clearly do not have 
any roots of any kind. For other polynomials, the answer is not so simple. It is 
a remarkable and very useful fact that every non-constant polynomial with real 
coefficients, or even with complex coefficients, has a complex root. 


The Fundamental Theorem of Algebra 9.3.1. Every non-constant polynomial 
with complex coefficients has a complex root.

There are a number of different proofs of the Fundamental Theorem of Algebra.
They all rely on mathematical concepts that we do not develop in this book. We will 
therefore simply discuss implications of this theorem without proving it. 

How many roots can a polynomial have? 


Example 9.3.2. The only root of the polynomial $p(z) = z^2 - 6z + 9$ is z = 3. This
follows from the fact that p(z) = (z − 3)(z − 3). Since the product of two complex
numbers is 0 only if at least one of the numbers is 0, the only solution to p(z) = 0
is z = 3.

In some sense, however, this polynomial has 3 as a “double root”; we’ll discuss
this a little more below.


To explore the question of the number of roots that a polynomial can have, we 
need to use division of one polynomial by another. This concept of division is very 
similar to “long division” of one natural number by another. Actually, we only need 
a special case of this concept, the case where the polynomial divisor is linear (i.e., 
has degree 1). We begin with an example. 


Example 9.3.3. To divide z − 3 into $z^4 + 5z^3 - 2z + 1$, proceed as follows:


                z^3 + 8z^2 + 24z + 70
    z - 3 ) z^4 + 5z^3         -  2z +   1
            z^4 - 3z^3
                  8z^3         -  2z +   1
                  8z^3 - 24z^2
                         24z^2 -  2z +   1
                         24z^2 - 72z
                                 70z +   1
                                 70z - 210
                                       211


What this calculation shows (like with long division of numbers) is that

$$z^4 + 5z^3 - 2z + 1 = (z - 3)(z^3 + 8z^2 + 24z + 70) + 211$$
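Division by a linear polynomial z − r is mechanical enough to automate. The following sketch uses the standard shortcut often called synthetic division, which is a compact way of organizing the long division above and is not a method described in this book; the function name `divide_by_linear` and the convention of listing coefficients from the highest power down are our own assumptions.

```python
def divide_by_linear(coeffs, r):
    """Divide a polynomial by z - r.

    coeffs lists the coefficients from the highest power down to the
    constant term. Returns (quotient coefficients, remainder).
    """
    carried = []
    acc = 0
    for a in coeffs:
        acc = acc * r + a      # bring down the next coefficient
        carried.append(acc)
    return carried[:-1], carried[-1]


# z^4 + 5z^3 + 0z^2 - 2z + 1 divided by z - 3
quotient, remainder = divide_by_linear([1, 5, 0, -2, 1], 3)
print(quotient)    # [1, 8, 24, 70], i.e. z^3 + 8z^2 + 24z + 70
print(remainder)   # 211
```

Note that the missing $z^2$ term must be entered as a coefficient of 0.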


The only consequence of the division of one polynomial by another that we need 
for present purposes is the following. 


Theorem 9.3.4. If r is a complex number and p(z) is a non-constant polynomial
with complex coefficients, then there exists a polynomial q(z) and a constant c such
that

$$p(z) = (z - r)q(z) + c$$

Proof. We will proceed by using the Principle of Complete Mathematical Induction
(2.2.1) on the degree of the polynomial p(z). Since p(z) is non-constant, the
base case of our induction proof is when the degree of p(z) is 1. In other words,
p(z) = az + b, where a and b are complex numbers and a ≠ 0. Let r be a complex
number. As in Example 9.3.3, we use long division to divide z − r into p(z):


              a
    z - r ) az + b
            az - ar
                 ar + b


This shows that p(z) = az + b = (z − r) · a + (ar + b). Therefore, setting q(z) = a
and c = ar + b gives us the desired result when the degree of p(z) is 1. Thus, the
base case of the induction is established.

Now assume that the theorem is true for all polynomials of degree less than or
equal to n. Using this assumption, we will show that the theorem holds for every
polynomial of degree n + 1. Let p(z) be a polynomial with complex coefficients
with degree n + 1. That is,

$$p(z) = a_{n+1}z^{n+1} + a_n z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$$

where each $a_i$ is a complex number and $a_{n+1}$ is nonzero. Let r be a complex number.
Once again we use “long division” to divide z − r into p(z):


              a_{n+1} z^n
    z - r ) a_{n+1} z^{n+1} + a_n z^n + a_{n-1} z^{n-1} + ... + a_1 z + a_0
            a_{n+1} z^{n+1} - r a_{n+1} z^n
                    (a_n + r a_{n+1}) z^n + a_{n-1} z^{n-1} + ... + a_1 z + a_0


To simplify the notation, let $p_n(z) = (a_n + ra_{n+1})z^n + a_{n-1}z^{n-1} + \cdots + a_1 z + a_0$.
Then the above gives $p(z) = (z - r)(a_{n+1}z^n) + p_n(z)$. Since $p_n(z)$ is a polynomial
of degree less than or equal to n, the induction hypothesis tells us that there exist a
polynomial $q_n(z)$ and a constant c such that $p_n(z) = (z - r)q_n(z) + c$. (If $p_n(z)$
happens to be constant, we can simply take $q_n(z) = 0$ and $c = p_n(z)$.) Thus,

$$\begin{aligned}
p(z) &= (z - r)(a_{n+1}z^n) + p_n(z) = (z - r)(a_{n+1}z^n) + (z - r)q_n(z) + c \\
&= (z - r)(a_{n+1}z^n + q_n(z)) + c
\end{aligned}$$

Therefore, setting $q(z) = a_{n+1}z^n + q_n(z)$ gives us the desired result when the degree
of p(z) is n + 1. □

Definition 9.3.5. The polynomial f(z) is a factor of the polynomial p(z) if there
exists a polynomial q(z) such that p(z) = f(z)q(z).


The Factor Theorem 9.3.6. The complex number r is a root of a polynomial p(z) 
if and only if z — r is a factor of p(z). 


Proof. If (z − r) is a factor of p(z), then p(z) = (z − r)q(z) implies that p(r) =
(r − r)q(r) = 0 · q(r) = 0. Conversely, suppose that r is a root of p(z). By
Theorem 9.3.4, p(z) = (z − r)q(z) + c for some constant c. Substituting r for
z and using the fact that r is a root gives 0 = (r − r)q(r) + c, so 0 = 0 + c,
from which it follows that c = 0. Hence, p(z) = (z − r)q(z) and z − r is a factor
of p(z). □


Example 9.3.7. The complex number 2i is a root of the polynomial $iz^3 + z^2 - 4$ (as
can be seen by simply substituting 2i for z in the expression for the polynomial and
noting that the result is 0). It follows from the Factor Theorem (9.3.6) that z − 2i
is a factor of the given polynomial. Doing “long division” gives
$iz^3 + z^2 - 4 = (z - 2i)(iz^2 - z - 2i)$.


We can use the Factor Theorem to determine the maximum number of roots that 
a polynomial can have. 


Theorem 9.3.8. A polynomial of degree n has at most n complex roots; if “multi- 
plicities” are counted, it has exactly n roots. 


Proof. Let p(z) be a polynomial of degree n. If n is at least 1, then p(z) has
a root, say $r_1$, by the Fundamental Theorem of Algebra (9.3.1). By the Factor
Theorem (9.3.6), there exists a polynomial $q_1(z)$ such that $p(z) = (z - r_1)q_1(z)$. The
degree of $q_1$ is clearly n − 1. If n − 1 > 0, then $q_1(z)$ has a root, say $r_2$. It follows from
the Factor Theorem that there is a polynomial $q_2(z)$ such that $q_1(z) = (z - r_2)q_2(z)$.
The degree of $q_2(z)$ is n − 2, and

$$p(z) = (z - r_1)(z - r_2)q_2(z)$$

This process can continue (a formal proof can be given using mathematical
induction) until a quotient is simply a constant, say k. Then,

$$p(z) = k(z - r_1)(z - r_2)\cdots(z - r_n)$$

If the $r_i$ are all different, the polynomial will have n roots. If some of the $r_i$
coincide, collecting all the terms where $r_i$ is equal to a given r produces a factor of
the form $(z - r)^m$, where m is the number of times that r occurs in the factorization.
In this situation, we say that r is a root of multiplicity m of the polynomial. Thus,
a polynomial of degree n has at most n distinct roots. If the roots are counted
according to their multiplicities, then a polynomial of degree n has exactly n
roots. □
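The factorization $p(z) = k(z - r_1)(z - r_2)\cdots(z - r_n)$ can also be read in the other direction: given a list of roots, with repetitions for multiplicities, we can multiply out the factors to recover the coefficients. The sketch below does this; the function names `poly_from_roots` and `evaluate`, and the convention of listing coefficients from the highest power down, are our own assumptions.

```python
def poly_from_roots(roots, k=1):
    """Return the coefficients of k(z - r_1)(z - r_2)...(z - r_n)."""
    coeffs = [k]
    for r in roots:
        shifted = coeffs + [0]                  # coefficients of z * p(z)
        scaled = [0] + [r * c for c in coeffs]  # coefficients of r * p(z)
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs


def evaluate(coeffs, z):
    """Evaluate the polynomial with the given coefficients at z."""
    acc = 0
    for a in coeffs:
        acc = acc * z + a
    return acc


double_root = poly_from_roots([3, 3])
print(double_root)                 # [1, -6, 9], i.e. z^2 - 6z + 9
print(evaluate(double_root, 3))    # 0
```

A list of n roots always produces n + 1 coefficients, that is, a polynomial of degree n, in agreement with Theorem 9.3.8.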
9.4 Problems


Basic Exercises 


1. Write the following complex numbers in a + bi form, where a and b are real
numbers:
10 3+2i 
i (dq) 45 
. (at aA) (e) 3 
©) (5+) oi 
3 j 11 (g) pls 
(ce) (+5 
© (444) cn) 4 


2. Show that the real part of $(1 + i)^{10}$ is 0.
3. Find both square roots of each of the following numbers:


(a) —i 
(b) —15 — 8i 
[Hint: Suppose (a + bi)? = —15 — 8i and compute a and b.] 


4. Find all the cube roots of each of the following numbers:


(a) 2
(b) $8\sqrt{3} + 8i$


5. (a) Prove that the conjugate of the sum of any two complex numbers is the sum
of their conjugates.
(b) Prove that the conjugate of the product of any two complex numbers is the
product of their conjugates.


Interesting Problems 


6. Prove the Quadratic Formula; i.e., prove that the polynomial $az^2 + bz + c$,
where a, b and c are any complex numbers and a is different from 0, has roots

$$z = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

[Hint: Rewrite the equation as $z^2 + \frac{b}{a}z + \frac{c}{a} = 0$, and use the fact that
$\left(z + \frac{b}{2a}\right)^2 = z^2 + \frac{b}{a}z + \frac{b^2}{4a^2}$.]

7. Find all solutions to the equation $iz^2 + 2z + i = 0$.
8. Find a polynomial p with integer coefficients such that $p(3 + i\sqrt{7}) = 0$.
9. Find all the complex roots of the polynomial $z^6 + z^3 + 1$.
10. Find all the complex roots of the polynomial $z^7 - z$.
11. Find a polynomial whose complex roots are 2 − i, 2 + i, 7.


Challenging Problems 


+1 

12. Find all the complex solutions of = i. 

13. Let p be a polynomial with real coefficients. Prove that the complex conjugate 
of each root of p is also a root of p. 

[Hint: Use Problem 5 in this chapter. ] 

14. Show that every non-constant polynomial with real coefficients can be factored
into a product of linear (i.e., of degree 1) and quadratic (i.e., of degree 2)
polynomials, each of which also has real coefficients.

15. Extend De Moivre’s Theorem (9.2.6) to prove that, for negative integers n,

$$(r(\cos\theta + i\sin\theta))^n = r^n(\cos n\theta + i\sin n\theta)$$


Chapter 10
Sizes of Infinite Sets


How many natural numbers are there? How many even natural numbers 
are there? How many odd natural numbers are there? How many rational numbers 
are there? How many real numbers are there? How many points are there in the 
plane? How many sets of natural numbers are there? How many different circles are 
there in the plane? An answer to all these questions could simply be: there are an 
infinite number of them. But there are more precise answers that can be given; there 
are, in a sense that we will explain, an infinite number of different size infinities. 


10.1 Cardinality


Definition 10.1.1. By a set we simply mean any collection of things; the things are 
called elements of the set. (As will be discussed at the end of this chapter, such a 
general definition of a set is problematic in certain senses.) 


For example, the collection of all words on this page is a set. The collection 
containing the letters a, b, and c is a set: it could be denoted {a, b, c}. The set of all 
real numbers greater than 4 could be written: 


{x:x > 4} 


The fact that something is an element of a set is often denoted using the Greek letter
epsilon, $\in$. We write $x \in S$ to represent the fact that $x$ is an element of the set $S$.
For example, if $S = \{x : x > 4\}$, then $17 \in S$.


Definition 10.1.2. If $S$ is a set, a subset of $S$ is a set all of whose elements are
elements of the set $S$. The notation $T \subset S$ is used to signify that $T$ is a subset of $S$.
The empty set is the set that has no elements at all. It is denoted $\emptyset$. The empty set is,
by definition, a subset of every set. That is, $\emptyset \subset S$ for every set $S$.


© Springer Nature Switzerland AG 2018 89 
D. Rosenthal et al., A Readable Introduction to Real Mathematics, 
Undergraduate Texts in Mathematics, https://doi.org/10.1007/978-3-030-00632-7_10 




The union of a collection of sets is the set consisting of all elements that occur
in at least one of the given sets. The union of sets $S$ and $T$ is denoted $S \cup T$, and
similar notation is used for the union of more than two sets.

The intersection of a collection of sets is the set consisting of all elements that are
in every set in the given collection. The intersection of the sets $S$ and $T$ is denoted
$S \cap T$, and similar notation is used for the intersection of more than two sets. If the
intersection of two sets is the empty set, the sets are said to be disjoint.


How should we define the concept that two sets have the same number of
elements? For finite sets, we count the number of elements in each set. When
we count the number of elements in a set, we assign the number 1 to one of
the elements of the set, then assign the number 2 to another element of the set,
then 3 to another element of the set, and so on, until we have counted every
element in the set. If the set has $n$ elements, when we finish counting we will
have assigned a number in the set $\{1, 2, 3, \ldots, n\}$ to each element of the set and
will not have assigned two different numbers to the same element in the set. That
is, counting that a set has $n$ elements produces a pairing of the elements of the
set $\{1, 2, 3, \ldots, n\}$ with the elements of the set that we are counting. A set whose
elements can be paired with the elements of the set $\{1, 2, 3, \ldots, n\}$ is said to have $n$
elements.


More generally, we can say that two sets have the same number of elements if 
the elements of those two sets can be paired with each other. 


Example 10.1.3. Pairs of running shoes are manufactured in a given factory. Each 
day, some number of pairs is manufactured. Even without knowing how many pairs 
were manufactured in a given day, we can still conclude that the same number of 
left shoes was manufactured as the number of right shoes that was manufactured, 
since they are manufactured in pairs. If, for example, the number of left shoes was 
determined to be 1012, then it could be concluded that the number of right shoes was 
also 1012. This could be established as follows: since the set {1, 2, 3,..., 1012} can 
be paired with the set of left shoes, it could also be paired with the set of right shoes, 
simply by pairing each right shoe to the number assigned to the corresponding left 
shoe in the pair. 


The above discussion suggests the general definition that we shall use. In the 
following, the phrase “have the same cardinality” is the standard mathematical 
terminology for what might colloquially be expressed “have the same size.” 

We will say that the sets $S$ and $T$ “have the same cardinality” if there is a pairing
of the elements of $S$ with the elements of $T$.

We need to precisely define what is meant by a “pairing” of the elements of two
sets. This can be specified in terms of functions. A function from a set $S$ into a set
$T$ is simply an assignment of an element of $T$ to each element of $S$. For example,
if $S = \{a, b, d, e\}$ and $T = \{+, \pi\}$, then one particular function taking $S$ to $T$ is
the function $f$ defined by $f(a) = \pi$, $f(b) = \pi$, $f(d) = +$, and $f(e) = \pi$.

Definition 10.1.4. The notation $f : S \to T$ is used to denote a function $f$ taking
the set $S$ into the set $T$; that is, a mapping of each element of $S$ to an element of $T$.
The set $S$ is called the domain of the function. The range of a function is the set of
all its values; that is, the range of $f : S \to T$ is $\{f(s) : s \in S\}$.


Definition 10.1.5. A function $f : S \to T$ is one-to-one (or injective) if $f(s_1) \neq
f(s_2)$ whenever $s_1 \neq s_2$. That is, a function is one-to-one if it does not send two
different elements to the same element.


We also require another property that functions may have. 


Definition 10.1.6. A function $f : S \to T$ is onto (or surjective) if for every $t \in T$
there is an $s \in S$ such that $f(s) = t$; that is, the range of $f$ is all of $T$.


Note that a one-to-one, onto function from a set $S$ onto a set $T$ gives a pairing of
the elements of $S$ with those of $T$.

The formal definition of when sets are to be considered to have the same size can 
be stated as follows. 


Definition 10.1.7. The sets $S$ and $T$ have the same cardinality if there is a function
$f : S \to T$ that is one-to-one and is onto all of $T$.
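For finite sets, the conditions in Definitions 10.1.5 through 10.1.7 can be checked mechanically. The following Python sketch is only an illustration: it assumes a function is stored as a dictionary whose keys are the elements of the domain, and the helper names (`is_one_to_one`, `is_onto`, `is_pairing`) and the example data are invented here rather than taken from the text.

```python
def is_one_to_one(f):
    """True if no two different keys of the dict f share a value (Definition 10.1.5)."""
    values = list(f.values())
    return len(set(values)) == len(values)


def is_onto(f, codomain):
    """True if every element of codomain is a value of f (Definition 10.1.6)."""
    return set(f.values()) == set(codomain)


def is_pairing(f, codomain):
    """True if f is one-to-one and onto, so it witnesses equal cardinality."""
    return is_one_to_one(f) and is_onto(f, codomain)


# Hypothetical data: three left shoes paired with three right shoes.
shoes = {"left1": "right1", "left2": "right2", "left3": "right3"}

# A function that is onto {"x", "y"} but sends two elements to "x".
collapsing = {1: "x", 2: "x", 3: "y"}
```

Here `is_pairing(shoes, {"right1", "right2", "right3"})` holds, while `collapsing` is onto but not one-to-one, so it is not a pairing. For infinite sets no such exhaustive check is possible, which is why the examples that follow argue from the formula for the function.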


We require the concept of the inverse of a function. If $f$ is a one-to-one function
mapping a set $S$ onto a set $T$, then there is a function mapping $T$ onto $S$ that “sends
elements back to where they came from” via $f$.


Definition 10.1.8. If $f$ is a one-to-one function mapping $S$ onto $T$, then the inverse
of $f$, often denoted $f^{-1}$, is the function mapping $T$ onto $S$ defined by $f^{-1}(t) = s$
when $f(s) = t$.


With respect to this definition, note that $f$ must be onto for $f^{-1}$ to be defined on
all of $T$. Also, $f$ must be one-to-one; otherwise for some $t$ there will be more than
one $s$ for which $f(s) = t$ and therefore $f^{-1}(t)$ will not be determined. If $f$ is a
one-to-one function mapping $S$ onto $T$, then $f^{-1}$ is a one-to-one function mapping
$T$ onto $S$.

Let’s consider some examples. 


Example 10.1.9. The set of even natural numbers and the set of odd natural numbers 
have the same cardinality. 


Proof. Write the set of even natural numbers as $E = \{2, 4, \ldots, 2n, \ldots\}$ and the set
of odd natural numbers as $O = \{1, 3, \ldots, 2n - 1, \ldots\}$. To satisfy Definition 10.1.7,
we need to show that there is a one-to-one function taking $E$ onto $O$. Define a
function $f : E \to O$ by letting $f(k) = k - 1$, for each $k$ in $E$. To see that
this $f$ is one-to-one, simply note that $k_1 - 1 = k_2 - 1$ implies $k_1 = k_2$. Also, $f$ is
clearly onto. Thus, the sets $E$ and $O$ have the same cardinality. □

It is not very surprising that the set of even natural numbers and the set of odd
natural numbers have the same cardinality. The following example is a little more
unexpected.

Example 10.1.10. The set of even natural numbers has the same cardinality as the
set of all natural numbers.


Proof. This is surprising at first because it seems that the set of even numbers should
have half as many elements as the set of all natural numbers. However, it is easy to
prove that these sets, $E$ and $N$, do have the same cardinality. Simply define the
function $f : N \to E$ by $f(n) = 2n$. It is easily seen that $f$ is one-to-one: if
$f(n_1) = f(n_2)$, then $2n_1 = 2n_2$, so $n_1 = n_2$. The function $f$ is onto since every
even number is of the form $2k$ for some natural number $k$. Therefore, $N$ and $E$ have
the same cardinality. □
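The pairing used in Example 10.1.10 can be tabulated for the first few natural numbers. The Python sketch below is illustrative only: `f_inverse` is a name introduced here for the map that halves an even number, and a finite table of course proves nothing about the infinite sets themselves.

```python
def f(n):
    """The pairing of Example 10.1.10: send the natural number n to 2n."""
    return 2 * n


def f_inverse(m):
    """Send an even natural number m back to the n with f(n) = m."""
    if m <= 0 or m % 2 != 0:
        raise ValueError("m must be an even natural number")
    return m // 2


# The first few pairs (n, f(n)).
pairs = [(n, f(n)) for n in range(1, 6)]
```

Each natural number appears exactly once as a first coordinate and each even number exactly once as a second coordinate, which is what "one-to-one and onto" requires.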


Thus, in the sense of the definition we are using, the subset $E$ of $N$ has the
same size as the entire set $N$. This shows that, with respect to cardinality, it is not
necessarily the case that “the whole is greater than any of its parts.”

Another example showing that “the whole” can have the same cardinality as “one 
of its parts” is the following. 


Example 10.1.11. The set of natural numbers and the set of nonnegative integers 
have the same cardinality. 


Proof. The set of natural numbers is $N = \{1, 2, 3, \ldots\}$. Let $S$ denote the set
$\{0, 1, 2, 3, \ldots\}$ of nonnegative integers. We want to construct a one-to-one function
$f$ taking $S$ onto $N$. We can simply define $f$ by $f(n) = n + 1$, for each $n$ in $S$.
Clearly $f$ maps $S$ onto $N$. Also, $f(n_1) = f(n_2)$ implies $n_1 + 1 = n_2 + 1$, which
gives $n_1 = n_2$. That is, $f$ does not send two different integers to the same natural
number, so $f$ is one-to-one. Therefore, $N$ and $S$ have the same cardinality. □


The following notation is useful. 


Notation 10.1.12. We use the notation $|S| = |T|$ to mean that $S$ and $T$ have the
same cardinality.


Therefore, as shown above, $|O| = |E| = |N|$.

How does the size of the set of all positive rational numbers, which we will
denote by $Q^+$, compare to the size of the set of natural numbers? The subset of $Q^+$
consisting of those rational numbers with numerator 1 can obviously be paired with
$N$: simply pair $\frac{1}{n}$ with $n$, for each $n$ in $N$. But then there are all the rational numbers
with numerator 2, and with numerator 3, and so on. It seems that there are many
more positive rational numbers than there are natural numbers. However, we now
prove that $|N| = |Q^+|$.


Theorem 10.1.13. The set of natural numbers and the set of positive rational 
numbers have the same cardinality. 


Proof. To prove this theorem, we first describe a way of displaying all the positive
rational numbers. We imagine writing all the rational numbers with numerator 1 in
one line, and then, underneath that, the rational numbers with numerator 2 in a line,
and under that the rational numbers with numerator 3 in a line, and so on. That is,
we consider the following array:

$$
\begin{array}{cccccccc}
\frac{1}{1} & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7} & \cdots \\
\frac{2}{1} & \frac{2}{2} & \frac{2}{3} & \frac{2}{4} & \frac{2}{5} & \frac{2}{6} & \frac{2}{7} & \cdots \\
\frac{3}{1} & \frac{3}{2} & \frac{3}{3} & \frac{3}{4} & \frac{3}{5} & \frac{3}{6} & \frac{3}{7} & \cdots \\
\frac{4}{1} & \frac{4}{2} & \frac{4}{3} & \frac{4}{4} & \frac{4}{5} & \frac{4}{6} & \frac{4}{7} & \cdots \\
\vdots & & & & & & &
\end{array}
$$

Imagining the positive rational numbers arranged as above, we can show that the
natural numbers can be paired with them. That is, we will define a one-to-one
function $f$ taking $N$ onto $Q^+$. As we define the function, you should keep looking
back at the array to see the pattern that we are using.

Define $f(1) = \frac{1}{1}$ and $f(2) = \frac{1}{2}$. (We can’t continue by $f(3) = \frac{1}{3}$, $f(4) = \frac{1}{4}, \ldots$,
for then $f$ would only map onto those rational numbers with numerator 1.) Define
$f(3) = \frac{2}{1}$ and $f(4) = \frac{3}{1}$. We can’t just keep going down in our array; we must
include the numbers above as well. We do not include $\frac{2}{2}$ however, since $\frac{2}{2} = \frac{1}{1}$,
which is already paired with 1. Thus, we let $f(5) = \frac{1}{3}$, $f(6) = \frac{1}{4}$, $f(7) = \frac{2}{3}$,
$f(8) = \frac{3}{2}$, $f(9) = \frac{4}{1}$, and $f(10) = \frac{5}{1}$. We do not consider $\frac{4}{2}$ since $\frac{4}{2} = \frac{2}{1}$,
and we do not consider $\frac{3}{3} = \frac{1}{1}$ or $\frac{2}{4} = \frac{1}{2}$. Thus, $f(11)$ is defined to be $\frac{1}{5}$ and
$f(12) = \frac{1}{6}$. It is apparent that a pairing of the natural numbers and the positive
rational numbers is indicated by continuing to label rational numbers with natural
numbers in this manner, “zigzagging,” you might say, through the above array.
Therefore, $|Q^+| = |N|$. □
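The zigzag through the array can be written as a short program. The Python sketch below is one possible reading of the pattern, not the authors' own formulation: it walks the diagonals on which numerator plus denominator is constant, alternating direction, and it skips a fraction when it is not in lowest terms, on the assumption that this is equivalent to skipping a number that has already been listed (the reduced form of such a fraction lies on an earlier diagonal). The name `zigzag_rationals` is invented here.

```python
from fractions import Fraction
from itertools import islice
from math import gcd


def zigzag_rationals():
    """Yield f(1), f(2), f(3), ... by zigzagging through the array of fractions."""
    d = 2  # d = numerator + denominator, constant along each diagonal
    while True:
        numerators = range(1, d)
        if d % 2 == 0:
            # Even diagonals are walked from the first column upward.
            numerators = reversed(numerators)
        for i in numerators:
            j = d - i
            if gcd(i, j) == 1:  # skip fractions equal to an earlier entry
                yield Fraction(i, j)
        d += 1


first_twelve = list(islice(zigzag_rationals(), 12))
```

The first twelve values produced are 1/1, 1/2, 2/1, 3/1, 1/3, 1/4, 2/3, 3/2, 4/1, 5/1, 1/5, 1/6. Every positive rational in lowest terms sits on exactly one diagonal, so it is reached after finitely many steps and is never repeated.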


10.2 Countable Sets and Uncountable Sets 


You may be wondering whether or not every infinite set can be paired with the set 
of natural numbers. If the elements of a set can be paired with the natural numbers, 
then the elements can be listed in a sequence. For example, if we let $s_1$ be the
element of the set corresponding to the natural number 1, $s_2$ be the element of the
set corresponding to the natural number 2, $s_3$ to 3, and so on, then the set could
be displayed:

$$\{s_1, s_2, s_3, \ldots\}$$


Pairing the elements of a set with the set of natural numbers is, in a sense, “counting 
the elements of the set.” 




Definition 10.2.1. A set is countable (sometimes called denumerable, or 
enumerable) if it is either finite or has the same cardinality as the set of natural 
numbers. A set is said to be uncountable if it is not countable. 


Definition 10.2.2. For $a$ and $b$ real numbers with $a < b$, the closed interval from $a$
to $b$ is the set of all real numbers between $a$ and $b$, including $a$ and $b$. It is denoted
$[a, b]$. That is, $[a, b] = \{x : a \leq x \leq b\}$.


One example of an uncountable set is the following. 
Theorem 10.2.3. The closed interval [0, 1] is uncountable. 


Proof. We must prove that there is no way of pairing the set of natural numbers 
with the interval [0, 1]. To establish this, we will show that every pairing of natural 
numbers with elements of [0, 1] fails to include some members of [0, 1]. In other 
words, we will show that there does not exist any function that maps N onto [0, 1]. 

We use the fact that the elements of $[0, 1]$ can be written as infinite decimals;
that is, in the form $.c_1c_2c_3\ldots$, where each $c_i$ is a digit between 0 and 9. (This
fact will be formally established in Chapter 13 of this book (see Theorem 13.6.3).)
Some numbers have two different such representations. For example, $.9999\ldots$ is
the same number as $1.0000\ldots$, and $.19999\ldots$ is the same number as $.20000\ldots$
(see Section 13.6). For the rest of this proof, let us agree that we choose the
representation involving an infinite string of 9’s rather than the representation involving
an infinite string of 0’s for all numbers that have two different representations.

Suppose that f is any function taking N to [0, 1]. To prove that f cannot be onto, 
we imagine writing out all the values of f in a list, as follows: 


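$$
\begin{aligned}
f(1) &= .a_{11}a_{12}a_{13}a_{14}a_{15}\ldots \\
f(2) &= .a_{21}a_{22}a_{23}a_{24}a_{25}\ldots \\
f(3) &= .a_{31}a_{32}a_{33}a_{34}a_{35}\ldots \\
f(4) &= .a_{41}a_{42}a_{43}a_{44}a_{45}\ldots \\
f(5) &= .a_{51}a_{52}a_{53}a_{54}a_{55}\ldots \\
&\;\;\vdots
\end{aligned}
$$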


We now construct a number in $[0, 1]$ that is not in the range of the function $f$. We
do that by showing how to choose digits $b_j$ so that the number $x = .b_1b_2b_3b_4\ldots$ is
not in the range of $f$. Begin by choosing $b_1 = 3$ if $a_{11} \neq 3$ and $b_1 = 4$ if $a_{11} = 3$.
Then choose $b_2 = 3$ if $a_{22} \neq 3$ and $b_2 = 4$ if $a_{22} = 3$. We continue in this manner,
choosing $b_j = 3$ if $a_{jj} \neq 3$ and $b_j = 4$ if $a_{jj} = 3$, for every natural number $j$. The
number $x$ that is so constructed has a unique decimal representation (since there are
no 9’s or 0’s in its representation) and differs from $f(j)$ in its $j$th digit. Therefore,
$f(j) \neq x$ for all $j$, so $x$ is not in the range of $f$. That is, we have proven that there
is no function (one-to-one or otherwise) taking $N$ onto $[0, 1]$, so we conclude that
$[0, 1]$ has cardinality different from that of $N$. □

Of course, any given function $f$ in the above proof could be modified so as
to produce a function whose range does include the specific number $x$ that we
constructed in the course of the proof. For example, given any such function $f$,
define the function $g : N \to [0, 1]$ by defining $g(1) = x$ and $g(n) = f(n - 1)$, for
$n \geq 2$. The range of $g$ includes $x$ and also includes the range of $f$. However, $g$ does
not map $N$ onto $[0, 1]$, for the above proof could be used to produce a different $x$
that is not in the range of $g$.
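The choice of digits in the proof of Theorem 10.2.3 is a purely mechanical rule, and it can be tried out on a finite list. The Python sketch below is an illustration under stated assumptions: each listed number is represented only by a string of its first few decimal digits, the list `rows` is made-up sample data, and the name `diagonal_digits` is invented here. A finite list can only show the first few digits of the number $x$; the actual proof needs all of them.

```python
def diagonal_digits(rows):
    """Return the first len(rows) digits of x.

    rows[j] holds the digits of the (j+1)-st listed number after the
    decimal point, and must contain at least j+1 digits.
    The rule is the one in the proof: use 3 unless the diagonal digit is 3,
    in which case use 4.
    """
    return "".join("4" if row[j] == "3" else "3" for j, row in enumerate(rows))


# Hypothetical list of five numbers, five digits each.
rows = ["14159", "33333", "50000", "99939", "27182"]
x_digits = diagonal_digits(rows)
```

For this sample the diagonal digits are 1, 3, 0, 3, 2, so the rule gives the digits 3, 4, 3, 4, 3. The resulting number differs from the $j$th listed number in its $j$th digit, so it cannot equal any number on the list.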

How does the cardinality of other closed intervals compare to that of [0, 1]? 


Theorem 10.2.4. If $a$ and $b$ are real numbers and $a < b$, then $[a, b]$ and $[0, 1]$
have the same cardinality.


Proof. The theorem will be established if we construct a function $f : [0, 1] \to
[a, b]$ that is one-to-one and onto. That is easy to do. Simply define $f$ by $f(x) =
a + (b - a)x$. Then $f(0) = a$ and $f(1) = b$. If $x$ is in $[0, 1]$, then $a + (b - a)x$ is
greater than or equal to $a$ and less than or equal to $b$, so $f$ takes $[0, 1]$ into $[a, b]$. To
show that $f$ is onto, let $y$ be any element of $[a, b]$. Let $x = \frac{y - a}{b - a}$. Then $x \in [0, 1]$
and $f(x) = y$. This shows that $f$ is onto. To show that $f$ is one-to-one, assume
that $a + (b - a)x_1 = a + (b - a)x_2$. Subtracting $a$ from both sides of this equation
and then dividing both sides by $b - a$ yields $x_1 = x_2$. This proves that $f$ is
one-to-one. Thus, $f$ is a pairing of the elements of $[0, 1]$ with the elements of $[a, b]$, so
$|[0, 1]| = |[a, b]|$. □
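The function in this proof and the formula used to show that it is onto are inverses of each other, which can be spot-checked numerically. The Python sketch below is illustrative: it uses exact rational arithmetic from the standard `fractions` module, so it only samples rational points, and the names `make_pairing` and `f_inverse` are introduced here.

```python
from fractions import Fraction


def make_pairing(a, b):
    """Return f : [0, 1] -> [a, b] from the proof, together with its inverse."""
    a, b = Fraction(a), Fraction(b)
    if not a < b:
        raise ValueError("need a < b")

    def f(x):
        return a + (b - a) * x

    def f_inverse(y):
        # The x chosen in the proof to show that f is onto.
        return (y - a) / (b - a)

    return f, f_inverse


# Example with the interval [3, 7].
f, f_inverse = make_pairing(3, 7)
```

With these choices `f(0)` is 3, `f(1)` is 7, and `f_inverse(f(x))` returns `x` for every sampled `x`, matching the one-to-one and onto arguments above.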


There are other intervals that frequently arise in mathematics. 


Definition 10.2.5. If $a$ and $b$ are real numbers and $a < b$, then the open interval
between $a$ and $b$, denoted $(a, b)$, is defined by

$$(a, b) = \{x : a < x < b\}.$$

The half-open intervals are defined by

$$(a, b] = \{x : a < x \leq b\} \quad\text{and}\quad [a, b) = \{x : a \leq x < b\}.$$


How does the size of a half-open interval compare to the size of the corresponding
closed interval?


Theorem 10.2.6. The intervals $[0, 1]$ and $(0, 1]$ have the same cardinality.


Proof. We want to construct a one-to-one function $f$ taking $[0, 1]$ onto $(0, 1]$. We
will define $f(x) = x$ for most $x$ in $[0, 1]$, but we need to make a place for 0 to
go to in the half-open interval. For each natural number $n$, the rational number $\frac{1}{n}$
is in both intervals. Define $f$ on those numbers by $f\left(\frac{1}{n}\right) = \frac{1}{n+1}$ for $n \in N$. In
particular, $f(1) = \frac{1}{2}$. Note that the number 1, which is in $(0, 1]$, is not in the range
of $f$ as defined so far. We define $f(0)$ to be 1. We define $f$ on the rest of $[0, 1]$ by
$f(x) = x$. That is, $f(x) = x$ for those $x$ other than 0 that are not of the form $\frac{1}{n}$
with $n$ a natural number. It is straightforward to check that we have constructed a
one-to-one function mapping $[0, 1]$ onto $(0, 1]$. □
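The function constructed in the proof of Theorem 10.2.6 has three cases, and writing them out as code makes the "shift" along the numbers 1, 1/2, 1/3, ... easy to see. The Python sketch below is illustrative and rests on an assumption worth stating: it uses `fractions.Fraction`, so it can only be evaluated at rational points, whereas the function in the proof is defined on every real number in the interval (irrational points are simply left fixed).

```python
from fractions import Fraction


def f(x):
    """Map [0, 1] onto (0, 1]: 0 goes to 1, 1/n goes to 1/(n+1), all else is fixed."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    if x == 0:
        return Fraction(1)            # make a place for 0
    if x.numerator == 1:              # x = 1/n for some natural number n
        return Fraction(1, x.denominator + 1)
    return x                          # every other point is left where it is
```

For example, `f(0)` is 1, `f(1)` is 1/2, `f(Fraction(1, 3))` is 1/4, and `f(Fraction(2, 3))` is 2/3. No input is sent to 0, and the value 1 is hit only by the input 0.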

Suppose that $|S| = |T|$ and $|T| = |U|$; must $|S| = |U|$? If this were not the case,
we would be using the “equals” sign in a very peculiar way.


Theorem 10.2.7. If |S| = |T| and |T| = |U|, then |S| = |U|. 


Proof. By hypothesis, there exist one-to-one functions $f$ and $g$ mapping $S$ onto $T$
and $T$ onto $U$, respectively. That is, $f : S \to T$ and $g : T \to U$. Let $h = g \circ f$
be the composition of $g$ and $f$. In other words, $h$ is the function defined on $S$ by
$h(s) = g(f(s))$. We must show that $h$ is a one-to-one function taking $S$ onto $U$. Let
$u$ be any element of $U$. Since $g$ is onto, there exists a $t$ in $T$ such that $g(t) = u$. Since
$f$ is onto, there is an $s$ in $S$ such that $f(s) = t$. Then $h(s) = g(f(s)) = g(t) = u$.
Thus, $h$ is onto.

To see that $h$ is one-to-one, suppose that $h(s_1) = h(s_2)$; we must show that
$s_1 = s_2$. Now $g(f(s_1)) = g(f(s_2))$, so $f(s_1) = f(s_2)$ since $g$ is one-to-one. But
$f$ is also one-to-one, and so $s_1 = s_2$. We have shown that $h$ is one-to-one and onto,
from which it follows that $|S| = |U|$. □


Theorem 10.2.8. If $a$, $b$, $c$, and $d$ are real numbers with $a < b$ and $c < d$, then the
half-open intervals $(a, b]$ and $(c, d]$ have the same cardinality.


Proof. The function $f$ defined by $f(x) = a + (b - a)x$ is a one-to-one
function mapping $(0, 1]$ onto $(a, b]$, as can be seen by a proof almost exactly
the same as that in Theorem 10.2.4. Hence, $|(0, 1]| = |(a, b]|$. Similarly the
function $g$ defined by $g(x) = c + (d - c)x$ is a one-to-one function mapping
$(0, 1]$ onto $(c, d]$, so $|(0, 1]| = |(c, d]|$. It follows from Theorem 10.2.7 that
$|(a, b]| = |(c, d]|$. □


Are there more nonnegative real numbers than there are real numbers in [0, 1]? 
The, perhaps surprising, answer is “no.” 


Theorem 10.2.9. The cardinality of the set of nonnegative real numbers is the same 
as the cardinality of the unit interval [0, 1]. 


Proof. We begin by showing that the set $S = \{x : x \geq 1\}$ has the same cardinality
as $(0, 1]$. Note that the function $f$ defined by $f(x) = \frac{1}{x}$ maps $S$ into $(0, 1]$; for if
$x \geq 1$, then $\frac{1}{x} \leq 1$. Also, $f$ maps $S$ onto $(0, 1]$; for if $y \in (0, 1]$, then $\frac{1}{y} \geq 1$ and
$f\left(\frac{1}{y}\right) = y$. To see that $f$ is one-to-one, suppose that $f(x_1) = f(x_2)$. Then $\frac{1}{x_1} = \frac{1}{x_2}$,
so $x_1 = x_2$. Hence, $f$ is one-to-one and onto, and it follows that $|S| = |(0, 1]|$.
Now let $T = \{x : x \geq 0\}$. Define the function $g$ by $g(x) = x - 1$. Then $g$ is
obviously a one-to-one function mapping $S$ onto $T$. Hence $|T| = |S|$. Therefore,
by Theorem 10.2.7, $|T| = |(0, 1]|$. But, by Theorem 10.2.6, $|[0, 1]| = |(0, 1]|$. It
follows that $|T| = |[0, 1]|$. □


Must the union of two countable sets be countable? A much stronger result 
is true. 




Theorem 10.2.10. The union of a countable number of countable sets is countable. 


Proof. This can be proven using ideas similar to those used in the proof of the 
fact that the set of positive rational numbers is countable (see Theorem 10.1.13). 
Recall that “countable” means either finite or having the same cardinality as N 
(Definition 10.2.1). We will prove this theorem for the cases where all the sets are 
infinite; you should be able to see how to modify the proof if some or all of the 
cardinalities are finite. 

Suppose, then, that we have a countable collection $\{S_1, S_2, S_3, \ldots\}$ of sets, each
of which is itself countably infinite. By pairing the elements of $S_i$ with the elements
of $N$, we label the elements of $S_i$ so that $S_i = \{a_{i1}, a_{i2}, a_{i3}, \ldots\}$. We display the
sets in the following array:

$$
\begin{aligned}
S_1 &= \{a_{11}, a_{12}, a_{13}, a_{14}, a_{15}, a_{16}, a_{17}, \ldots\} \\
S_2 &= \{a_{21}, a_{22}, a_{23}, a_{24}, a_{25}, a_{26}, a_{27}, \ldots\} \\
S_3 &= \{a_{31}, a_{32}, a_{33}, a_{34}, a_{35}, a_{36}, a_{37}, \ldots\} \\
S_4 &= \{a_{41}, a_{42}, a_{43}, a_{44}, a_{45}, a_{46}, a_{47}, \ldots\} \\
&\;\;\vdots
\end{aligned}
$$


Let $S$ denote the union of the $S_i$’s. To show that $S$ is countable, we show that
we can list all of its elements. Proceed as follows. First, list $a_{11}$ and then $a_{12}$. Then
consider $a_{21}$. It is possible that $a_{21}$ is one of $a_{11}$ or $a_{12}$, in which case we do not,
of course, list it again. If, however, $a_{21}$ is neither $a_{11}$ nor $a_{12}$, we list it next. Then
look at $a_{31}$; if it is not yet listed, list it next. Then go back up to $a_{22}$, then $a_{13}$, and
so on. In this way, we “zigzag” through the entire array (as we did in the proof of
Theorem 10.1.13) and list all the elements of $S$. It follows that $S$ is countable. □
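The listing procedure in this proof can be sketched as a generator. The Python code below is an illustration under stated assumptions, not part of the text: the family of sets is supplied as a function `element(i, j)` returning the $j$th listed element of the $i$th set (both indices starting at 1), the name `zigzag_union` is invented here, and the example family (the $i$th set being the multiples of $i$) is made up for the demonstration.

```python
from itertools import count, islice


def zigzag_union(element):
    """List the union of S_1, S_2, ... without repetition.

    element(i, j) must return a_ij, the j-th listed element of S_i.
    """
    seen = set()  # elements already listed; grows as the listing proceeds
    for d in count(2):  # d = i + j is constant along each diagonal
        rows = range(1, d)
        if d % 2 == 0:
            rows = reversed(rows)  # alternate direction to zigzag
        for i in rows:
            a = element(i, d - i)
            if a not in seen:  # do not list an element a second time
                seen.add(a)
                yield a


# Hypothetical family: S_i is the set of multiples of i, so a_ij = i * j.
first_eight = list(islice(zigzag_union(lambda i, j: i * j), 8))
```

The entries are visited in the order $a_{11}, a_{12}, a_{21}, a_{31}, a_{22}, a_{13}, \ldots$, as in the proof. For the sample family the first eight elements listed are 1, 2, 3, 4, 6, 5, 8, 9; repeated values such as $a_{21} = 2$ are skipped.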


10.3 Comparing Cardinalities


When two sets have different cardinalities, the question arises of whether we can 
say that one set has cardinality that is less than the cardinality of the other set. 
What should we mean by saying that the cardinality of one set is less than that of 
another set? It is easiest to begin with a definition of “less than or equal to,” instead 
of “less than,” for cardinalities. 


Definition 10.3.1. If $S$ and $T$ are sets, we say that $S$ has cardinality less than or
equal to the cardinality of $T$, and write $|S| \leq |T|$, if there is a subset $T_0$ of $T$ such
that $|S| = |T_0|$.


This is equivalent to saying that there is a one-to-one function mapping $S$ into
(not necessarily onto) $T$. For if $f$ is a one-to-one function mapping $S$ onto $T_0$, we
can regard $f$ as a function taking $S$ into $T$. Conversely, if $f$ is a one-to-one function
mapping $S$ into $T$, and if $T_0$ is the range of $f$, then $f$ gives a pairing of $S$ and $T_0$.


Example 10.3.2. The function $f : N \to [0, 1]$ defined by $f(n) = \frac{1}{n}$ establishes
that $|N| \leq |[0, 1]|$, since $f$ is one-to-one.


Note that $|S_0| \leq |S|$ whenever $S_0$ is a subset of $S$, since the function $f : S_0 \to S$
defined by $f(s) = s$, for each $s$ in $S_0$, is clearly one-to-one.

We have defined “$\leq$” for cardinalities; how should we define “$<$”? The following
definition is very natural.


Definition 10.3.3. We say that $S$ has cardinality less than that of $T$, and write
$|S| < |T|$, if $|S| \leq |T|$ and $|S| \neq |T|$.


Example 10.3.4. If $N$ is the set of natural numbers and $[0, 1]$ is the unit interval,
then $|N| < |[0, 1]|$.


Proof. By Example 10.3.2, $|N| \leq |[0, 1]|$, and, by Theorem 10.2.3, $|N| \neq |[0, 1]|$,
so the result follows. □


Thus, in the sense of the definitions we are using, there are more real numbers in 
the interval [0, 1] than there are natural numbers. 

There is a question that immediately arises from the definition of “less than or
equal to” for cardinalities: If $S$ and $T$ are sets such that $|S| \leq |T|$ and $|T| \leq |S|$,
must $|S| = |T|$? The language suggests that this question should have an affirmative
answer, but that language doesn’t prove anything. What does this question come
down to? We are given the fact that $|S| \leq |T|$. That is equivalent to the existence
of a one-to-one function $f : S \to T$. Similarly, $|T| \leq |S|$ implies that there is
a one-to-one function $g : T \to S$. To say that $|S| = |T|$ is equivalent to saying
that there exists a function $h : S \to T$ that is both one-to-one and onto. The
question, therefore, is whether we can show the existence of such a function $h$ from
the existence of the functions $f$ and $g$.

In addition to being important in justifying the above terminology, the following 
theorem is often very useful in proving that given sets have the same cardinalities. 


The Cantor–Bernstein Theorem 10.3.5. If $S$ and $T$ are sets such that $|S| \leq |T|$
and $|T| \leq |S|$, then $|S| = |T|$.


Proof. The hypotheses imply that there exist one-to-one functions $f : S \to T$ and
$g : T \to S$; these functions may or may not be onto. We must construct a one-to-one
function $h$ that takes $S$ onto $T$. To do this, we will break $S$ up into three subsets
and then define $h$ to be the function $f$ on two of those subsets and the function $g^{-1}$
on the third subset.

Consider any element $s$ of $S$. Such an $s$ may or may not be in the range of $g$. If it is
in the range of $g$, then there is exactly one element $t_0$ in $T$ such that $g(t_0) = s$, since
$g$ is one-to-one. Call such an element $t_0$ the “immediate ancestor” of $s$. Similarly, if
$t$ is in $T$ and $f(s_0) = t$ for some $s_0$ in $S$, we say that $s_0$ is the “immediate ancestor”
of $t$. Thus, elements of $S$ have immediate ancestors in $T$ if they are in the range of
$g$, and elements of $T$ have immediate ancestors in $S$ if they are in the range of $f$.
Some elements may not have any immediate ancestors.

We will say that an immediate ancestor of an immediate ancestor of an element $s$
in $S$ is an “ancestor” of the element $s$. That is, if $s$ in $S$ has an immediate ancestor $t_0$
in $T$ and $t_0$ has an immediate ancestor $s_0$ in $S$, then $s_0$ is an ancestor of $s$. Similarly,
if $t_1$ in $T$ has an immediate ancestor $s_1$ in $S$ and $s_1$ has an immediate ancestor
$t_2$ in $T$, we say that $t_2$ is an ancestor of $t_1$. We continue backwards whenever
possible. In other words, we start with a given element and then keep on finding
immediate ancestors unless and until we reach an element that does not have an
immediate ancestor. All the ancestors in such a chain of immediate ancestors are
called ancestors of the original element of $S$ or $T$.

For each element that we start with, there are three possibilities. One possibility
is that the element has an ancestor and that every ancestor of the element also has
an ancestor. That is, it could be that we can keep on going back and back and
back indefinitely in the ancestry of a given element. Let $S_\infty$ denote the set of all
those elements $s$ in $S$ for which we can keep on finding ancestors without stopping.
Similarly, let $T_\infty$ denote the set of all $t$ in $T$ for which we can keep on finding
ancestors without stopping. (It might be noted that it is possible that we can keep
on finding ancestors indefinitely, but nonetheless there are only a finite number of
distinct ancestors. For example, it would be possible that, for some $s$ in $S$ and $t$ in $T$,
$f(s) = t$ and $g(t) = s$. Then the immediate ancestor of $s$ would be $t$, the immediate
ancestor of $t$ would be $s$, the immediate ancestor of $s$ would be $t$, and so on. Thus,
there would be no stopping the process of finding ancestors, in spite of the fact that
each of $s$ and $t$ has only two distinct ancestors, $s$ and $t$. In this situation, $s \in S_\infty$
and $t \in T_\infty$.)

Those elements of $S$ and $T$ that are not in either of $S_\infty$ or $T_\infty$ have what might
be called “ultimate ancestors.” That is, since the chain of ancestors comes to a stop,
there is a most distant ancestor. Of course, one possibility is that the element has no
ancestors at all, in which case we say that element is its own ultimate ancestor. The
ultimate ancestor of any given element is either in $S$ or in $T$. Let $S_S$ denote the set
of all elements of $S$ whose ultimate ancestor is in $S$ and let $S_T$ denote the set of all
elements of $S$ whose ultimate ancestor is in $T$. Similarly, let $T_S$ and $T_T$ denote the
sets of elements of $T$ whose ultimate ancestors are in $S$ and $T$, respectively.

Thus, we have divided $S$ into three subsets: $S_\infty$, $S_S$, and $S_T$. Every element of
$S$ is in exactly one of those subsets. Similarly, every element of $T$ is in exactly one
of the subsets $T_\infty$, $T_S$, or $T_T$. (Of course, some of the subsets may be empty.)

We can now define the function $h$. For $s$ in $S$, we define $h(s)$ to be $f(s)$ if $s$ is in
either $S_\infty$ or $S_S$, and we define $h(s)$ to be $g^{-1}(s)$ if $s$ is in $S_T$. Note that $g^{-1}(s)$ is
defined for all $s \in S_T$ since all the elements of $S_T$ have immediate ancestors in $T$.
We will show that $h$ is a one-to-one function taking $S$ onto $T$.

Let’s first show that $h$ is one-to-one. Suppose that $h(s_1) = h(s_2)$ for $s_1$ and $s_2$
in $S$. We must show that $s_1 = s_2$. If both of $s_1$ and $s_2$ are in the union of $S_\infty$ and
$S_S$, then $h(s_1) = f(s_1)$ and $h(s_2) = f(s_2)$. Therefore, $f(s_1) = f(s_2)$. Since $f$ is
one-to-one, it follows that $s_1 = s_2$ in this case. Similarly, if both of $s_1$ and $s_2$ are
in $S_T$, then $h(s_1) = g^{-1}(s_1)$ and $h(s_2) = g^{-1}(s_2)$. Therefore, $g^{-1}(s_1) = g^{-1}(s_2)$.
Applying $g$ to both sides of this equation gives $s_1 = s_2$ in this case.

One case remains: the case where one of $s_1$ and $s_2$ is in the union of $S_S$ and
$S_\infty$ and the other is in $S_T$. Suppose that $s_1 \in S_\infty \cup S_S$ and $s_2 \in S_T$. Then
$h(s_1) = f(s_1)$ and $h(s_2) = g^{-1}(s_2)$. Therefore, $f(s_1) = g^{-1}(s_2)$. We show that
this case cannot arise. If $f(s_1) = g^{-1}(s_2)$, then $s_1$ is an immediate ancestor of
$g^{-1}(s_2)$. Thus, $s_1$ is an ancestor of $s_2$. But $s_2$ is in $S_T$, so it has an ultimate ancestor
in $T$. Since $s_1$ is an ancestor of $s_2$, the ultimate ancestor of $s_2$ is the ultimate ancestor
of $s_1$. But $s_1$ being in $S_\infty \cup S_S$ implies that $s_1$ either has no ultimate ancestor or has
an ultimate ancestor in $S$. This is inconsistent with having an ultimate ancestor in
$T$, so this case does not arise.

We have proven that the function $h$ that we constructed is one-to-one. It remains
to be shown that $h$ maps $S$ onto $T$.

Each $t$ in $T$ is in one of $T_S$, $T_\infty$, or $T_T$. We must show that, wherever $t$ lies,
there is an $s$ in $S$ such that $h(s) = t$. Suppose first that $t \in T_S$. Since $t$ has an
ultimate ancestor in $S$, it follows that $t$ is in the range of $f$, so we can consider
$f^{-1}(t)$. The ancestors of $f^{-1}(t)$ are also ancestors of $t$, from which it follows that
the ultimate ancestor of $f^{-1}(t)$ is in $S$. That is, $f^{-1}(t)$ is in $S_S$. The function $h$ is
defined to be $f$ on $S_S$, so $h(f^{-1}(t)) = f(f^{-1}(t)) = t$. This shows that the range
of $h$ contains every element of $T_S$.

Now consider any $t$ in $T_\infty$. Such a $t$ has an immediate ancestor in $S$, $f^{-1}(t)$.
Since the ancestors of $f^{-1}(t)$ are also ancestors of $t$, $f^{-1}(t)$ has no ultimate
ancestor. That is, $f^{-1}(t)$ is in $S_\infty$. The function $h$ was defined to be the function
$f$ on $S_\infty$, so $h(f^{-1}(t)) = f(f^{-1}(t)) = t$. This proves that the range of $h$
contains $T_\infty$.

All that remains to be shown is that the range of $h$ includes $T_T$. Suppose, then,
that $t$ is in $T_T$. Let $s = g(t)$. Then $t$ is the immediate ancestor of $s$. Thus, the
ultimate ancestor of $t$ is the ultimate ancestor of $s$. Since the ultimate ancestor of $t$
is in $T$, the ultimate ancestor of $s$ is in $T$. In other words, $s$ is in $S_T$. On elements
of $S_T$, $h$ is defined to be $g^{-1}$. Thus, $h(s) = g^{-1}(s)$ and, since $s = g(t)$, $h(s) =
g^{-1}(g(t)) = t$. This establishes that the range of $h$ includes $T_T$.

We have therefore shown that, for every $t$ in $T$, whatever subset of $T$ contains $t$,
there is an $s$ in $S$ such that $h(s) = t$. This proves that $h$ is onto.

Therefore, $h$ is a one-to-one function mapping $S$ onto $T$, and we conclude that
$|S| = |T|$. □
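The construction of $h$ can be followed step by step on a concrete pair of one-to-one functions. The Python sketch below is an illustration under several assumptions, none of them from the text: each function is supplied together with a partial inverse that returns `None` when an element has no immediate ancestor; a cap `max_steps` stands in for "we can keep finding ancestors without stopping," which a program cannot detect in general; and the example takes $S = T = \{1, 2, 3, \ldots\}$ with $f(s) = s + 1$ and $g(t) = t + 1$. The names `cantor_bernstein`, `shift`, and `unshift` are invented here.

```python
def cantor_bernstein(f, f_inverse, g_inverse, max_steps=10_000):
    """Return h built from one-to-one maps f : S -> T and g : T -> S.

    f_inverse(t) and g_inverse(s) return the immediate ancestor, or None
    when the element is not in the range of f (respectively g).
    """
    def h(s):
        current, in_S = s, True
        for _ in range(max_steps):
            parent = g_inverse(current) if in_S else f_inverse(current)
            if parent is None:
                # current is the ultimate ancestor of s.
                # Ultimate ancestor in S: use f. In T: use g^{-1}.
                return f(s) if in_S else g_inverse(s)
            current, in_S = parent, not in_S
        # Assumption: a chain this long never stops, so treat s as in S_infinity.
        return f(s)

    return h


# Hypothetical example: S = T = {1, 2, 3, ...}, f(s) = s + 1, g(t) = t + 1.
def shift(n):
    return n + 1


def unshift(n):
    # 1 is not in the range of shift, so it has no immediate ancestor.
    return n - 1 if n >= 2 else None


h = cantor_bernstein(shift, unshift, unshift)
first_six = [h(s) for s in range(1, 7)]
```

In this example the odd numbers in $S$ have their ultimate ancestor in $S$ and the even numbers have theirs in $T$, so $h$ swaps 1 with 2, 3 with 4, and so on; `first_six` is `[2, 1, 4, 3, 6, 5]`. Neither `shift` alone nor its counterpart is onto, yet the patched-together $h$ is.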


Corollary 10.3.6. If $S$ is a subset of $T$ and there exists a function $f : T \to S$ that
is one-to-one, then $S$ and $T$ have the same cardinality.


Proof. Since $S$ is a subset of $T$, $|S| \leq |T|$. Since there is a one-to-one function
mapping $T$ into $S$, $|T| \leq |S|$. Then, by the Cantor–Bernstein Theorem (10.3.5),
$|S| = |T|$. □


The Cantor–Bernstein Theorem can often be used to simplify proofs that given
sets have the same cardinalities.


Theorem 10.3.7. If $a < b$, then $|[a, b]| = |(a, b)| = |(a, b]| = |[a, b)|$.

Proof. Clearly, $|(a, b)| \leq |[a, b]|$. Note that $\left[a + \frac{b-a}{3}, b - \frac{b-a}{3}\right]$ is contained in
$(a, b)$, so $\left|\left[a + \frac{b-a}{3}, b - \frac{b-a}{3}\right]\right| \leq |(a, b)|$. But, by Theorem 10.2.4, $|[a, b]| =
\left|\left[a + \frac{b-a}{3}, b - \frac{b-a}{3}\right]\right|$. Therefore, $|[a, b]| \leq |(a, b)|$. So, by the Cantor–Bernstein
Theorem (10.3.5), $|[a, b]| = |(a, b)|$.

The proofs for the half-open intervals are almost exactly the same as the above
proof for the open interval. □


What is the cardinality of the set of all real numbers? 


Theorem 10.3.8. The cardinality of the set of all real numbers is the same as the 
cardinality of the unit interval [0, 1]. 


Proof. Let R denote the set of all real numbers. We will “patch together” some of
the results that we have already proven to show that $|R| = |[0, 1]|$.

As we have seen, the set of nonnegative real numbers has the same cardinality
as [0, 1] (see Theorem 10.2.9). Thus, there exists a one-to-one function f mapping
the set of nonnegative real numbers onto [0, 1]. The set of negative real numbers
obviously has the same cardinality as the set of positive real numbers, as can be
seen by using the mapping that takes x to $-x$. The positive real numbers can be
mapped in a one-to-one way into [0, 1]. Since $|[0, 1]| = |[3, 4]|$ (Theorem 10.2.4),
it follows that the positive real numbers can be mapped in a one-to-one way into
[3, 4]. Then, using the equivalence of the positive and negative real numbers, we
conclude that there is a one-to-one function g mapping the negative real numbers
into [3, 4]. We now define a function h mapping R into $[0, 1] \cup [3, 4]$ by letting h be f
on the nonnegative numbers and g on the negative numbers. Since f and g are
one-to-one and the sets [0, 1] and [3, 4] are disjoint, h is a one-to-one
function mapping R into a subset of $[0, 1] \cup [3, 4]$, which is a subset of
[0, 4]. It follows that $|R| \le |[0, 4]|$. On the other hand, [0, 4] is a subset of R, so
$|[0, 4]| \le |R|$, and, by the Cantor–Bernstein Theorem (10.3.5), $|R| = |[0, 4]|$. Since
$|[0, 4]| = |[0, 1]|$ (Theorem 10.2.4), the theorem follows. □
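
The patching step can be sketched in Python. This is only an illustration: the formula x/(1 + x) is our own assumed choice of a one-to-one map of the nonnegative numbers into [0, 1] (it is not onto, and it is not the function supplied by Theorem 10.2.9), and exact fractions are used so that the one-to-one property can be checked on a sample without rounding.

```python
from fractions import Fraction

def f(x):
    """One-to-one map of the nonnegative numbers into [0, 1] (assumed formula)."""
    return x / (1 + x)

def g(x):
    """One-to-one map of the negative numbers into [3, 4] (assumed formula)."""
    return 3 + f(-x)

def h(x):
    """f on the nonnegative numbers and g on the negative numbers."""
    return f(x) if x >= 0 else g(x)

samples = [Fraction(n, 2) for n in range(-6, 7)]   # -3, -5/2, ..., 5/2, 3
images = [h(x) for x in samples]

# Distinct inputs give distinct outputs, all inside [0, 1] or [3, 4].
print(len(set(images)) == len(samples))
print(all(0 <= y <= 1 or 3 <= y <= 4 for y in images))
```

Only the one-to-one property of h is used in the proof, which is why a map that misses some points of [0, 1] and [3, 4] still serves as an illustration.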


There is a theorem that can often be used to provide very easy proofs that sets are 
countable. The next several results form the basis for that theorem and are useful in 
other contexts as well. 


Theorem 10.3.9. A subset of a countable set is countable. 


Proof. Let S be a countable set. If S is finite, then the result is clear. If S
is infinite, then there exists a one-to-one function, say f, mapping the set of
natural numbers onto S. Thus, the elements of S can be listed in a sequence,
$(f(1), f(2), f(3), f(4), \ldots)$. If $S_0$ is a subset of S, then the elements of $S_0$
correspond to some of the elements in the sequence. Therefore, the elements of
$S_0$ can also be listed in a sequence, and hence $S_0$ is either finite or has the same
cardinality as N. □


Corollary 10.3.10. If S is any set and there exists a one-to-one function mapping
S into the set of natural numbers, then S is countable.


Proof. Let f be a one-to-one function taking S into N. The range of f is some
subset T of N. Since f is a one-to-one function taking S onto T, it follows that
$|S| = |T|$. By the previous theorem, T is countable, and therefore so is S. □


Definition 10.3.11. A finite sequence of elements of a set S is an ordered collection
of elements of S of the form $(s_1, s_2, s_3, \ldots, s_k)$.


For example, one finite sequence of rational numbers is (−5, −7, 2, 0).


Theorem 10.3.12. The set of all finite sequences of natural numbers is countable. 


Proof. Let S denote the set of all finite sequences of natural numbers. By the above 
corollary (10.3.10), it suffices to show that there is a one-to-one function g mapping 
S into N. Here is a description of one such function. We define the value of g at each 
given finite sequence of natural numbers to be the number whose digits are 1’s and 
0’s, determined as follows: begin with the number of 1’s equal to the first number in 
the given finite sequence, follow that by a 0, then follow that by the number of 1’s 
equal to the second number in the sequence, then another 0, then the number of 1’s 
corresponding to the third number in the sequence, then a 0, and so on, ending with 
the number of 1’s corresponding to the last number in the sequence. For example, 


$$g((2, 3, 7)) = 11011101111111$$
and
$$g((5, 1)) = 1111101$$


The function g is one-to-one since the unique sequence corresponding to any 
number in the range of g can be recovered by using the definition of g. For 
example, the number 111101011111101111111101 corresponds to the sequence 
(4, 1, 6, 8, 1). Since g is one-to-one and maps S into N, S is countable. □
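
The function g and the recovery of a sequence from its value can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the proof; the name `recover` is our own, and the sequences are assumed to be nonempty with every entry at least 1.

```python
def g(sequence):
    """Encode a finite sequence of natural numbers as a number written with 1's and 0's."""
    return int("0".join("1" * n for n in sequence))

def recover(number):
    """Recover the sequence from a value of g by counting the 1's between the 0's."""
    return tuple(len(block) for block in str(number).split("0"))

print(g((2, 3, 7)))                          # 11011101111111
print(g((5, 1)))                             # 1111101
print(recover(111101011111101111111101))     # (4, 1, 6, 8, 1)
```

Because `recover` undoes g on every such sequence, two different sequences cannot have the same value under g.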


Corollary 10.3.13. If L is any countable set, then the set of all finite sequences of
elements of L is countable.


Proof. This follows easily from the above theorem. By hypothesis, there exists a
one-to-one function f mapping L into N. Then, a one-to-one function F mapping
sequences of elements of L into sequences of elements of N can be obtained by
defining

$$F((a_1, a_2, a_3, \ldots, a_k)) = (f(a_1), f(a_2), f(a_3), \ldots, f(a_k))$$

Thus, the previous theorem implies the corollary. □

The following definition will be useful.


Definition 10.3.14. Let L and T be any sets. We will say that T can be labeled by
L if there is a one-to-one function from T to the set of finite sequences of elements
of L. In other words, L can label T if there is a way of assigning to each element
of T a finite sequence of elements of L so that no finite sequence is assigned
to two different elements of T.


Example 10.3.15. The set N of natural numbers can be labeled by the set of digits 
$L = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$. A given natural number can be labeled by the
sequence from L consisting of its digits in the order in which they occur. For
example, 79288 could be labeled by the sequence (7, 9, 2, 8, 8). 


Example 10.3.16. The set Q of rational numbers can be labeled by the set
$L = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -, /\}$.


To label a given rational number, first express it in lowest terms and then assign
to it the sequence from L consisting of the minus sign if the number is negative,
followed by the digits of the numerator in the order in which they occur, then the /,
and then the digits of the denominator in their order. For example, $\frac{7}{125}$ would be
labeled by the sequence (7, /, 1, 2, 5) and $-\frac{29}{38}$ would be labeled by the sequence
(−, 2, 9, /, 3, 8).
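
The labeling of rational numbers can be sketched in Python, reproducing the two label sequences of the example. This is an illustration only: the function name `label` is our own, Python's `Fraction` type is relied on to reduce each number to lowest terms, and we assume that an integer such as 3 is written as 3/1.

```python
from fractions import Fraction

def label(q):
    """Label a rational number by a finite sequence of symbols from L."""
    q = Fraction(q)                      # Fraction stores q in lowest terms
    symbols = []
    if q < 0:
        symbols.append("-")              # minus sign first, if the number is negative
    symbols.extend(str(abs(q.numerator)))
    symbols.append("/")
    symbols.extend(str(q.denominator))
    return tuple(symbols)

print(label(Fraction(7, 125)))     # ('7', '/', '1', '2', '5')
print(label(Fraction(-58, 76)))    # ('-', '2', '9', '/', '3', '8')
```

The second call starts from −58/76, which is −29/38 in lowest terms, so the reduction step matters: without it the same number could receive more than one label.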


The following theorem is useful in many situations. It is a slight variant of the 
“Typewriter Principle” that was developed by the mathematician Bjorn Poonen. 


The Enumeration Principle 10.3.17. Every set that can be labeled by a countable 
set is countable. 


Proof. Let T be a set that is labeled by a countable set L. By definition there is
a one-to-one function mapping T into the set of finite sequences of elements of L,
which is a countable set by Corollary 10.3.13. It follows from Corollary 10.3.10 that
T is countable. □


Any set that is proven to be countable by the Enumeration Principle could, of 
course, also be proven to be countable without using this principle. However, the 
Enumeration Principle often leads to very simple proofs. 


Theorem 10.3.18. The set of all rational numbers is countable. 


Proof. As indicated above (Example 10.3.16), the set of rational numbers can be
labeled by the set $L = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, -, /\}$. Since L is finite, the
Enumeration Principle (10.3.17) gives the result. □


You might find the above proof more satisfying than the “zig-zag” proof of the 
fact that the set of positive rational numbers is countable (Theorem 10.1.13). 


Corollary 10.3.19. The set of all integers is countable. 


Proof. A subset of a countable set is countable (Theorem 10.3.9), so this follows
from the previous theorem (10.3.18). □


Example 10.3.20. The set $T = \{3 + \sqrt{m} + n^m : m \text{ and } n \text{ are natural numbers}\}$ is
countable. To see this, we use the Enumeration Principle (10.3.17). One possible
labeling set is $L = N \cup \{+, \sqrt{\ }\}$. Since it is the union of a countable set and a
finite set, L is countable (Theorem 10.2.10). The element $3 + \sqrt{m} + n^m$ of T can
be labeled by the sequence $(3, +, \sqrt{\ }, m, +, n, m)$, so the Enumeration Principle
gives the result.


You may have heard the assertion that $\pi$ is a “transcendental number”; what does
that mean?


Definition 10.3.21. The real number $x_0$ is said to be algebraic if it is the root of a
polynomial with integer coefficients. The real number $x_0$ is said to be transcendental
if there is no polynomial with integer coefficients that has $x_0$ as a root. (There are
also complex algebraic numbers; see Problem 30 at the end of this chapter.)


For example, the number $-\frac{3}{4}$ is algebraic, since it is a root of the polynomial
$4x + 3$. More generally, each rational number $\frac{m}{n}$ is algebraic since it is a root of the
polynomial $nx - m$. There are also many irrational numbers that are algebraic, such
as $\sqrt{2}$, which is a root of the polynomial $x^2 - 2$, and $(\frac{3}{4})^{1/5}$, which is a root of the
polynomial $4x^5 - 3$.

It is not so easy to prove the existence of transcendental numbers. It is well 
known that $\pi$ is transcendental, but it is very difficult to prove that fact. It is a
lot simpler, but still quite difficult, to prove that e, the base of the natural logarithm, 
is transcendental. It is a very surprising and beautiful fact that it is much easier to 
prove that most real numbers are transcendental than it is to prove that any specific 
real number is transcendental. This is a corollary of the following. 


Theorem 10.3.22. The set of real algebraic numbers is countable. 


Proof. We show that the set of real algebraic numbers can be labeled by the
integers; the Enumeration Principle (10.3.17) then establishes the theorem. Let $x_0$
be a real algebraic number. We label $x_0$ by specifying any polynomial with integer
coefficients that has $x_0$ as a root and then indicating the position that $x_0$ occupies
among the roots of that polynomial. The details of this labeling are as follows.

Let $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ be a polynomial of degree n with integer
coefficients that has $x_0$ as a root. The first terms of the label assigned to $x_0$ are the
coefficients of the polynomial listed according to the descending powers of x. If,
for any non-negative integer m less than n, the polynomial does not have a term of
degree m, then we list 0 as the corresponding coefficient; that is, $a_m = 0$. The last
term in the labeling of $x_0$ is the natural number k that indicates which position $x_0$
occupies among all the real roots of the polynomial ordered in the usual way. That
is, k = 1 if $x_0$ is the smallest real root of the polynomial, k = 2 if $x_0$ is the second
smallest real root, and so on. Thus, the label for $x_0$ is

$$(a_n, a_{n-1}, \ldots, a_1, a_0, k)$$

In this manner, every real algebraic number is labeled by a finite sequence of
integers, and different numbers receive different labels, since a label determines
both the polynomial and the position of the root. Since the set of integers is
countable (Corollary 10.3.19), it follows from the Enumeration Principle (10.3.17)
that the set of real algebraic numbers is countable. □
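
As a concrete illustration of this labeling (a worked example of our own, not taken from the text), consider the polynomial $x^2 - 2$, whose real roots in increasing order are $-\sqrt{2}$ and $\sqrt{2}$. Writing out the missing term of degree 1 with coefficient 0 gives the following labels.

```latex
\begin{align*}
x^2 - 2 &= 1 \cdot x^2 + 0 \cdot x + (-2), \\
-\sqrt{2} &\longmapsto (1,\ 0,\ -2,\ 1), \\
\sqrt{2} &\longmapsto (1,\ 0,\ -2,\ 2).
\end{align*}
```

The same number has other possible labels; for instance, starting from $2x^2 - 4$ would give $(2, 0, -4, 2)$ for $\sqrt{2}$. The labeling only requires that one such label be chosen for each number.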


The above easily establishes the existence of transcendental numbers. 
Corollary 10.3.23. The set of real transcendental numbers is uncountable. 


Proof. The set of real algebraic numbers is countable (Theorem 10.3.22). If the set
of real transcendental numbers were countable, then the set of all real numbers would
be the union of two countable sets, and therefore countable (Theorem 10.2.10).
Since the set of all real numbers is uncountable (Theorem 10.3.8 and Theorem 10.2.3),
the set of transcendental numbers is uncountable. □


The cardinality of a finite set consisting of n elements is said to be n. We now 
introduce some standard notation for the sizes of some of the most common infinite 
sets. 


Definition 10.3.24. We say that the set S has cardinality $\aleph_0$ (which we read “aleph
naught”) if the cardinality of S is the same as that of the natural numbers, in which
case we write $|S| = \aleph_0$.


For example, $|Q| = \aleph_0$.
There is also a standard notation for the cardinality of the set of real numbers. 


Definition 10.3.25. We say that the set S has cardinality c if the cardinality of S is 
the same as the cardinality of the set of real numbers; c is sometimes said to be the 
cardinality of the continuum. 


For example, |[3, 9]| =c. 

Note that $\aleph_0 < c$, in the sense that every set with cardinality $\aleph_0$ has cardinality
less than every set with cardinality c.

It is important to note that $\aleph_0$ is the smallest infinite cardinality, in the following
sense.


Theorem 10.3.26. If S is an infinite set, then $\aleph_0 \le |S|$.


Proof. To establish this, we must show that S has a subset $S_0$ of cardinality $\aleph_0$.
We proceed as follows. Since S is infinite, it surely contains some element, say $s_1$.
Similarly, $S \setminus \{s_1\}$ (i.e., the set obtained from S by removing $s_1$) contains some
element, say $s_2$. Similarly, $S \setminus \{s_1, s_2\}$ contains some element $s_3$. Proceeding in
this manner creates an infinite sequence $(s_1, s_2, s_3, \ldots)$ of elements of S. Let
$S_0 = \{s_1, s_2, s_3, \ldots\}$. Then clearly $|S_0| = |N| = \aleph_0$. Since $S_0$ is a subset of S, it follows
that $\aleph_0 \le |S|$. □


Thus, $\aleph_0$ is the smallest infinite cardinality. Is there a largest cardinality?


Definition 10.3.27. If S is any set, then the set of all subsets of S is called the 
power set of S and is denoted P(S). 

The terminology “power set of S” comes from the following theorem. (This was
stated as Problem 5 in Chapter 2.)


Theorem 10.3.28. If S is a finite set with n elements, then the cardinality of P(S)
is $2^n$.


Proof. First note that this is true for n = 0. For the only set with 0 elements is $\emptyset$, the
empty set. The empty set has one subset, namely itself. Since $2^0 = 1$, the theorem
holds for n = 0.

We proceed by mathematical induction. Suppose that every set with k elements
has $2^k$ subsets and let S be a set with k + 1 elements. Suppose that $s_0$ is any element
of S and let $S_0$ be the subset $S \setminus \{s_0\}$ of S obtained by removing $s_0$. Then $S_0$ has
k elements and, by the inductive hypothesis, $|P(S_0)| = 2^k$. Suppose that T is any
subset of $S_0$. Then T is also a subset of S. The set $T \cup \{s_0\}$ is a different subset
of S. Thus, for each subset T of $S_0$, there are two subsets of S, T and $T \cup \{s_0\}$,
and every subset of S is of exactly one of these two forms.
It follows that there are twice as many subsets of S as there are subsets of $S_0$.
That is,

$$|P(S)| = 2 \cdot |P(S_0)| = 2 \cdot 2^k = 2^{k+1}$$

The theorem follows by mathematical induction. □
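
The inductive step translates directly into a recursive procedure for listing all subsets of a finite set. The following Python sketch is only an illustration (the function name `power_set` is our own): it removes one element, lists the subsets of what remains, and then produces two subsets of the original set from each of them.

```python
def power_set(elements):
    """Return all subsets of the given list of distinct elements, as a list of sets."""
    if not elements:
        return [set()]                       # the empty set has exactly one subset
    s0, rest = elements[0], elements[1:]
    subsets_of_rest = power_set(rest)        # the subsets of S with s0 removed
    # each subset T of the smaller set yields two subsets of S: T and T with s0 added
    return subsets_of_rest + [t | {s0} for t in subsets_of_rest]

for n in range(5):
    print(n, len(power_set(list(range(n)))))   # the counts are 1, 2, 4, 8, 16
```

Each recursive call doubles the number of subsets, which is the doubling used in the induction.
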
What is the relationship between |S| and |P(S)| when S is an infinite set?


Theorem 10.3.29. For every set S, $|S| < |P(S)|$.


Proof. It is easy to see that $|S| \le |P(S)|$, for among the subsets of S are the
“singleton sets”; i.e., sets of the form {s}, for each $s \in S$. The collection $P_0$ of
all singleton subsets of S is a subset of P(S). A one-to-one function f mapping
S into P(S) can be defined by f(s) = {s}, for all s in S. Thus, $|S| = |P_0|$, so
$|S| \le |P(S)|$.

To show that $|S| < |P(S)|$, we must show that there does not exist any one-to-one
function f taking S onto P(S).

Suppose, then, that f is any function taking S into P(S). We will show that f 
cannot be onto; that is, that there is an element of P(S) (i.e., a subset of S) that is 
not in the range of f.

For each $s \in S$, f(s) is a subset of S. Define the subset $S_0$ of S by

$$S_0 = \{s \in S : s \notin f(s)\}$$

That is, the subset $S_0$ of S is defined to consist of all of those elements s of S that
are not in the subset of S that f assigns to s.

The set $S_0$ is an element of P(S). We will show that it is not in the range of f. To
prove this by contradiction, suppose that there were some $s_0 \in S$ such that $f(s_0) = S_0$.
We show that this is impossible by considering the question: is $s_0$ in $S_0$? We
will see that either answer to this question leads to a contradiction.

Suppose that $s_0 \notin S_0$. The definition of $S_0$ is that it consists of those elements of
S that are not in the subsets they are sent to by f. Since $f(s_0) = S_0$, $s_0$ is not in
$f(s_0)$, and thus $s_0$ is in $S_0$. In other words, $s_0 \notin S_0$ implies $s_0 \in S_0$, which is a
contradiction.

On the other hand, if $s_0$ is in $S_0$, then the definition of $S_0$ implies that $s_0$ is not
in $f(s_0)$. But $f(s_0) = S_0$, so $s_0 \notin S_0$. Thus, $s_0 \in S_0$ implies $s_0 \notin S_0$, which is also
a contradiction. If there were an $s_0$ satisfying $f(s_0) = S_0$, then $s_0$ would either be in
$S_0$ or not be in $S_0$. Therefore, there is no $s_0$ satisfying $f(s_0) = S_0$, and the theorem
is proven. □
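
For a small finite set the argument can be checked exhaustively. The following Python sketch is an illustration only (the helper names are our own): it runs through every function f from S = {1, 2, 3} into P(S), forms the set of all s that are not in f(s), and confirms that this set is never a value of f.

```python
from itertools import combinations, product

def subsets(elements):
    """All subsets of a finite collection, as frozensets."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def diagonal_set(f, S):
    """The set of all s in S that are not elements of f(s)."""
    return frozenset(s for s in S if s not in f[s])

S = [1, 2, 3]
P = subsets(S)

# Try every function f from S into P(S); each is stored as a dict s -> f(s).
missed_every_time = all(
    diagonal_set(dict(zip(S, values)), S) not in values
    for values in product(P, repeat=len(S))
)
print(missed_every_time)   # True
```

There are 512 such functions, and the check succeeds for each of them. The proof shows that the same thing happens for every set S, finite or infinite.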


Corollary 10.3.30. If S is any set, then there exists a set T whose cardinality is 
greater than that of S. 


Proof. By the previous theorem (10.3.29), T = P(S) establishes this corollary. □


In particular, for the set of real numbers R, the cardinality of P(R), the set of all
sets of real numbers, is greater than c. Because of the analogy to the case of finite
sets, it is standard to write $|P(R)| = 2^c$.

Similarly, $2^{\aleph_0}$ denotes the cardinality of P(N). By the above, $\aleph_0 < 2^{\aleph_0}$. Also, as
we have seen, $\aleph_0 < c$. What is the relationship between $2^{\aleph_0}$ and c?


Theorem 10.3.31. The cardinality of the set of all sets of natural numbers is the 
same as the cardinality of the set of real numbers. That is, $|P(N)| = c$, or $2^{\aleph_0} = c$.


Proof. Since $|[0, 1]| = |R|$ (Theorem 10.3.8) and $|[0, 1]| = |[0, 1)|$ (Theorem 10.3.7),
it suffices to prove that $|[0, 1)| = |P(N)|$. We require the fact that
numbers in [0, 1) can be represented by infinite decimals; that is, expressions such
as $.a_1 a_2 a_3 \ldots$, where each $a_i$ is a digit between 0 and 9 (this is shown in Chapter 13;
see Theorem 13.6.3). Some numbers have two such representations. For example,
.26999... = .27000... (see Section 13.6 of Chapter 13). In such cases, choose the
representation ending in a string of 0’s.

To show $|[0, 1)| \le |P(N)|$, define the function f from [0, 1) into P(N) by letting
$f(.a_1 a_2 a_3 \ldots)$ be the subset of N consisting of all natural numbers of the form
$a_k 10^k + 1$. That is, $f(.a_1 a_2 a_3 \ldots) = \{a_k 10^k + 1 : k \in N\}$. Note that the set
corresponding to an infinite decimal contains at most one number between $10^k$ and
$9 \cdot 10^k + 1$ for each natural number k. It contains such a number when $a_k$ is not 0.
The set contains 1 if and only if some $a_k$ is 0.

To show that f is one-to-one, suppose that $.a_1 a_2 a_3 \ldots$ is not equal to $.b_1 b_2 b_3 \ldots$.
Then, for some k, $a_k \ne b_k$. At most one of $a_k$ and $b_k$ is 0. Without loss
of generality, assume that $a_k$ is not 0. Then $a_k 10^k + 1$ is in $f(.a_1 a_2 a_3 \ldots)$.
However, the only number in $f(.b_1 b_2 b_3 \ldots)$ that could be between $10^k$ and
$9 \cdot 10^k + 1$ is $b_k 10^k + 1$, which is not equal to $a_k 10^k + 1$. Therefore $f(.a_1 a_2 a_3 \ldots)$
is not equal to $f(.b_1 b_2 b_3 \ldots)$. Thus, f is one-to-one, and it follows that
$|[0, 1)| \le |P(N)|$.

We now prove that $|P(N)| \le |[0, 1)|$. Define a function g taking P(N) to [0, 1)
by $g(S) = .c_1 c_2 c_3 \ldots$, where $c_j = 1$ if $j \in S$ and $c_j = 0$ if $j \notin S$. It is clear that g
is one-to-one, so $|P(N)| \le |[0, 1)|$.

The Cantor–Bernstein Theorem (10.3.5) completes the proof. □
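
The two one-to-one functions used in this proof can be explored on finite pieces of their inputs. The Python sketch below is an illustration under our own simplifying assumptions: `f_prefix` computes the numbers $a_k 10^k + 1$ contributed by the first few digits of a decimal, and `g_prefix` computes the first n digits of g(S). Both names are hypothetical, and an actual infinite decimal or infinite subset of N cannot, of course, be processed in full.

```python
def f_prefix(digits):
    """Numbers a_k * 10**k + 1 contributed by the first digits a_1, a_2, ... of a decimal."""
    return {a * 10**k + 1 for k, a in enumerate(digits, start=1)}

def g_prefix(subset, n):
    """First n digits of g(subset): digit j is 1 if j is in the subset and 0 otherwise."""
    return "." + "".join("1" if j in subset else "0" for j in range(1, n + 1))

print(sorted(f_prefix([2, 7, 0])))    # [1, 21, 701]
print(g_prefix({1, 3, 4}, 6))         # .101100
```

In the first call the digit 0 in the third place contributes the number 1, in agreement with the remark that the set contains 1 exactly when some digit is 0.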


Definition 10.3.32. The unit square in the plane is the subset of the plane consisting
of all points whose x and y coordinates are both between 0 and 1. That is, the unit
square is the set

$$\{(x, y) : 0 \le x \le 1,\ 0 \le y \le 1\}$$


Theorem 10.3.33. The cardinality of the unit square in the plane is c. 


Proof. Let S denote the unit square. It is clear that $|S| \ge c$, since S contains the
subset

$$S_0 = \{(x, 0) : 0 \le x \le 1\}$$

and there is an obvious pairing of $S_0$ with [0, 1].

To establish the reverse inequality, we will construct a one-to-one function f
mapping S into [0, 1]. We represent the coordinates of points in the unit square as
infinite decimals. In ambiguous cases (i.e., where a representation of a number could
end in either a string of 0’s or a string of 9’s), we choose the representation ending
in a string of 9’s. We then define the function f by

$$f((.a_1 a_2 a_3 \ldots,\ .b_1 b_2 b_3 \ldots)) = .a_1 b_1 a_2 b_2 a_3 b_3 \ldots$$

We claim that f is one-to-one. This follows since $f((x, y)) = .c_1 c_2 c_3 \ldots$ implies
that $x = .c_1 c_3 c_5 \ldots$ and $y = .c_2 c_4 c_6 \ldots$. Thus, $|S| \le |[0, 1]|$, so the Cantor–Bernstein
Theorem (10.3.5) gives $|S| = |[0, 1]|$. □
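
The interleaving of digits, and the way the two coordinates are recovered from the result, can be sketched in Python on finite strings of digits. This is an illustration only; the function names are our own, and the strings stand for the first few digits of the infinite decimals in the proof.

```python
def interleave(x_digits, y_digits):
    """Merge two equally long digit strings a1a2a3... and b1b2b3... into a1b1a2b2a3b3..."""
    return "".join(a + b for a, b in zip(x_digits, y_digits))

def split(c_digits):
    """Recover (x, y) from the merged string: odd positions give x, even positions give y."""
    return c_digits[0::2], c_digits[1::2]

print(interleave("2500", "7199"))   # 27510909
print(split("27510909"))            # ('2500', '7199')
```

Since `split` undoes `interleave`, two different pairs of digit strings cannot be merged into the same string, which is the reason f is one-to-one.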


It can be interesting to determine the cardinality of various sets of functions. 
We present one example below; other examples are given in the problems. The 
following definition will be useful. 


Definition 10.3.34. Let S be a set and $S_0$ be a subset of S. The characteristic
function of $S_0$ as a subset of S is the function f, with domain S, defined by f(s) = 1
if $s \in S_0$ and f(s) = 0 if $s \notin S_0$.


Note that the range of every characteristic function is contained in the two 
element set {0, 1}. Conversely, a function with domain S whose range is contained 
in {0, 1} is a characteristic function of a subset of S; namely, the set of all those 
s € S that the function takes to 1. 

The following is a very easy, but very useful, fact. 


Theorem 10.3.35. For any set S, the set of all characteristic functions with domain 
S has the same cardinality as P(S). 


Proof. As indicated in the definition above of characteristic function, each subset 
does have a characteristic function. On the other hand, if two characteristic functions 
are equal as functions, they must be characteristic functions of the same subset 
(the subset consisting of all elements of the set on which the functions have 
value 1). Thus, the correspondence between the set of subsets of S and characteristic
functions with domain S is one-to-one and onto. □
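
The correspondence between subsets and characteristic functions is easy to express in code. The following Python sketch is an illustration on a small finite set (the function names are our own): a characteristic function is stored as a dictionary, and the subset is recovered as the set of elements sent to 1.

```python
def characteristic_function(subset, S):
    """The characteristic function of subset as a subset of S, stored as a dict."""
    return {s: 1 if s in subset else 0 for s in S}

def subset_of(chi):
    """The subset consisting of all elements that chi takes to 1."""
    return {s for s, value in chi.items() if value == 1}

S = {"a", "b", "c"}
chi = characteristic_function({"a", "c"}, S)
print(chi == {"a": 1, "b": 0, "c": 1})   # True
print(subset_of(chi) == {"a", "c"})      # True
```

Passing from a subset to its characteristic function and back returns the original subset, which is the pairing described in the proof.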


Theorem 10.3.36. The cardinality of the set of all functions mapping [0, 1] into
[0, 1] is $2^c$.


Proof. Among the functions are those that take on values contained in the two-element
set {0, 1}; i.e., the characteristic functions with domain [0, 1]. By the
previous theorem (10.3.35), this set of characteristic functions has cardinality $2^c$.
Thus, the set of all functions mapping [0, 1] into [0, 1] has cardinality at least $2^c$.
To prove the reverse inequality, we use the fact that every function is determined
by its graph. The graph of a function f from [0, 1] to [0, 1] is $\{(x, f(x)) : x \in [0, 1]\}$,
which is a subset of the unit square. Clearly, every function has a graph, and
if two functions have the same graphs, then they are the same function. Thus, the set
of functions we are considering corresponds to a collection of some of the subsets
of the unit square and hence has cardinality at most equal to that of the set of all
subsets of the unit square. We have seen (Theorem 10.3.33) that the cardinality of
the unit square is c. It follows that the cardinality of the set of all subsets of the unit
square is $2^c$. Therefore, the cardinality of the set of graphs of functions (and thus of
the set of functions) is at most $2^c$. By the Cantor–Bernstein Theorem (10.3.5), the
cardinality of the set of functions is $2^c$. □


There are some serious deficiencies in the general approach to set theory that we 
have been describing. The following illustrates some of the problems. 


Cantor’s Paradox. Let S denote the set of all sets. Then every subset of S is an
element of S, since each subset is a set. That is, P(S) is a subset of S. Hence,
$|P(S)| \le |S|$. On the other hand, $|S| < |P(S)|$ (by Theorem 10.3.29). By the
Cantor–Bernstein Theorem (10.3.5), $|P(S)| \le |S|$ and $|S| \le |P(S)|$ together give
$|S| = |P(S)|$, which contradicts $|S| < |P(S)|$.


What does this contradiction mean? If there is a contradiction, then something 
is false; but what? The only assumption that we have made is that there is a set 
consisting of the set of all sets. This contradiction shows that there cannot be such a 
set. To avoid Cantor’s Paradox, the definition of set has to be more restrictive. 

There is another paradox similar to Cantor’s. 


Russell’s Paradox. Define a set to be ordinary if it is not an element of itself. (That 
is, $S \notin S$.) All of the sets that we have discussed so far, except for the set of all sets,
are ordinary sets. Each set is, of course, a subset of itself, but that is very different 
from being a member of itself. (For example, the set of natural numbers is not a 
natural number.) 

Let T denote the set of all ordinary sets. We now ask the question: is T an
ordinary set? If T were an ordinary set, then, since T is the set of all ordinary sets,
$T \in T$. But then T would not be an ordinary set, since it would be an element
of itself. On the other hand, if T is not an ordinary set, then $T \in T$. But every
element of T is an ordinary set, so it would follow that T is ordinary. That is, if T
is ordinary, it is not ordinary; if T is not ordinary, it is ordinary. There cannot be
such a set.

Note that the Cantor and Russell paradoxes are related in the following sense:
the set of all sets, if it existed, would be a set that is not an ordinary set.

When mathematicians became aware of the Cantor and Russell paradoxes, over 
a hundred years ago, they were very concerned. Why aren’t “the set of all sets” and 
“the set of all ordinary sets” themselves sets? What other “sets” are not really sets? 

The above and related paradoxes do not arise when considering sets that 
generally arise in doing mathematics. Mathematicians have developed several 
different “axiomatic set theories” in which the concept of “set” is restricted so that 
the Cantor and Russell paradoxes do not arise. In these set theories, there are no 
sets that are elements of themselves. The most popular of the axiomatic set theories 
is called Zermelo—Fraenkel Set Theory. The development of axiomatic set theories 
is fairly complicated and we will not discuss it here. However, the theorems that 
we presented in this chapter are also theorems in Zermelo—Fraenkel Set Theory 
although the formal proofs are slightly different. 

The following is a very natural question: is there any set S whose cardinality 
is greater than $\aleph_0$ and less than c? If there is such a set, there would be a one-to-
one function taking S into R. Therefore, if there is any such set, then there is a
subset of R with that property. The question can therefore be reformulated: if S is 
an uncountable subset of R, must the cardinality of S be c? This appears to be a very 
concrete question. It can be made even more concrete, as follows: if S is a subset of 
R and there is no one-to-one function taking S into N, must there exist a one-to-one 
function taking S onto R? 


The Continuum Hypothesis. There is no set with cardinality strictly between $\aleph_0$
and c.


It is very surprising that it is not known whether the Continuum Hypothesis is 
true or false. It is even more surprising that it has been proven that the Continuum 
Hypothesis is an undecidable proposition, in the following sense: it has been 
established that the Continuum Hypothesis can neither be proven nor disproven 
within standard set theories, such as Zermelo—Fraenkel Set Theory. Mathematicians 
disagree about the full implications of this. It is our view that it is possible that some- 
one will prove the Continuum Hypothesis in a way that would convince virtually 
all mathematicians, in spite of it being undecidable within Zermelo—Fraenkel Set 
Theory. That is, someone might begin a proof as follows: “Let S be an uncountable 
subset of R. We construct a one-to-one function f mapping S onto R by first ....”
Any such proof would have to use something that was not part of Zermelo—Fraenkel 
Set Theory, since it has been proven that the Continuum Hypothesis cannot be 
decided within Zermelo—Fraenkel Set Theory. On the other hand, it is our opinion 
that it is possible that a proof could be found that would be based on properties of 
the set of real numbers that most mathematicians would agree are true, in spite of 
the fact that at least one of them would not be part of Zermelo—Fraenkel Set Theory. 
However, many mathematicians believe that Zermelo—Fraenkel Set Theory captures 
all the reasonable properties of the real numbers and therefore conclude that no 
such proof is possible. We invite you to try to prove that those mathematicians 
are wrong by proving (or disproving) the Continuum Hypothesis. Your chance
of success is extremely low, but you might find it interesting to give it a little 
thought. 


10.4 Problems 


Basic Exercises 


1. Show that the set of all polynomials with rational coefficients is countable.

2. Suppose that the sets S, T, and U satisfy $S \subset T \subset U$ and that $|S| = |U|$.
Show that T has the same cardinality as S.

3. Let A and B be countable sets. Prove that the Cartesian product of A and B,
defined by $A \times B = \{(a, b) : a \in A, b \in B\}$, is countable.

4. Assume that $|A_1| = |B_1|$ and $|A_2| = |B_2|$. Prove:

(a) $|A_1 \times A_2| = |B_1 \times B_2|$.
(b) If $A_1$ is disjoint from $A_2$ and $B_1$ is disjoint from $B_2$, then
$|A_1 \cup A_2| = |B_1 \cup B_2|$.

5. Prove that the half-open intervals [0, 1) and (0, 1] have the same cardinality.
(This was stated but not proven in Theorem 10.3.7.)

6. What is the cardinality of the set of all functions from N to {1, 2}?

7. What is the cardinality of the set of all numbers in the interval [0, 1] that have
decimal expansions with a finite number of nonzero digits?

8. Let $Q(\sqrt{2})$ be the set of real numbers of the form $a + b\sqrt{2}$, where a and b are
rational numbers. Find the cardinality of $Q(\sqrt{2})$.


Interesting Problems 


9. Suppose that S and T each have cardinality c. Show that $S \cup T$ also has
cardinality c.

10. What is the cardinality of $R^2 = \{(x, y) : x, y \in R\}$ (the Euclidean plane)?

11. What is the cardinality of the set of all complex numbers?

12. Prove that the set of all finite subsets of Q is countable.

13. Let S and T be finite sets and let $C = \{f : S \to T\}$ be the set of all functions
from S to T. Show that if $|T| > 1$, then $|C| \ge 2^{|S|}$.

14. What is the cardinality of the unit cube, where the unit cube is
$\{(x, y, z) : x, y, z \in [0, 1]\}$?

15. What is the cardinality of $R^3 = \{(x, y, z) : x, y, z \in R\}$?

16. What is the cardinality of the set of all functions from {1, 2} to N?

17. Find the cardinality of the set of all points in $R^3$ all of whose coordinates are
rational.

18. Let S be the set of all functions mapping the set $\{\sqrt{2}, \sqrt{3}, \sqrt{5}, \sqrt{7}\}$ into Q.
What is the cardinality of S?

19. Find the cardinality of the set $\{(x, y) : x \in R, y \in Q\}$.

20. What is the cardinality of the set of all numbers in the interval [0, 1] that have 
decimal expansions that end with an infinite sequence of 7’s? 

21. Let t be a transcendental number. Prove that $t^4 + 7t + 2$ is also transcendental.

22. Suppose that T is an infinite set and S is a countable set. Show that $S \cup T$ has
the same cardinality as T.

23. Let S be the set of real numbers t such that cos t is algebraic. Prove that S is
countably infinite.

24. Let a, b, and c be distinct real numbers. Find the cardinality of the set of all 
functions mapping {a, b, c} into the set of real numbers. 

25. What is the cardinality of

$$\{n^{1/k} : n, k \in N\},$$

i.e., the set of all roots of all natural numbers?

26. Prove that there does not exist a set with a countably infinite power set. 

27. (This problem requires some basic facts about trigonometric functions.) Find a
one-to-one function mapping the interval $(-\frac{\pi}{2}, \frac{\pi}{2})$ onto R.

Challenging Problems 

28. (a) Prove directly that the cardinality of the closed interval [0, 1] is equal to the 

cardinality of the open interval (0, 1) by constructing a function from [0, 1] 
to (0, 1) that is one-to-one and onto. 

(b) More generally, show that if S is an infinite set and $\{a, b\} \subset S$, then
$|S| = |S \setminus \{a, b\}|$. (The notation $S \setminus \{a, b\}$ is used to denote the set of all s in S
such that s is not in {a, b}.)

[Hint: Use the fact that S has a countably infinite subset containing a and b.] 

29. Prove that a set is infinite if and only if it has the same cardinality as a proper 
subset of itself. (A proper subset is a subset other than the set itself.) 

30. A complex number is said to be algebraic if it is a root of a polynomial with 
integer coefficients. Prove that the set of all complex algebraic numbers is 
countable. 

31. What is the cardinality of the set of all finite subsets of R? 

32. What is the cardinality of the set of all countable sets of real numbers? 

33. Find the cardinality of the set of all lines in the plane. 

34. Show that the set of all functions mapping R × R into Q has cardinality 2^c. 
35. Prove the following: If x₀ is a real number and n is the smallest natural number 
such that a polynomial of degree n with integer coefficients has x₀ as a root, and 
if p and q are polynomials of degree n with integer coefficients that have the 
same leading coefficients (i.e., coefficients of xⁿ) and each have x₀ as a root, 
then p = q. 

36. Let S be the set of all real numbers that have a decimal representation using 
only the digits 2 and 6. Show that the cardinality of S is c. 

37. Let S denote the collection of all circles in the plane. Is the cardinality of S 
equal to c or 2^c? 

38. Prove that if S is uncountable and T is a countable subset of S, then the 
cardinality of S \ T (where S \ T denotes the set of all elements of S that 
are not in T) is the same as the cardinality of S. 

39. Find the cardinality of the set of all polynomials with real coefficients. That is, 
find the cardinality of the set of all expressions of the form 

aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀ 

where n is a nonnegative integer (that depends on the expression) and a₀, 
a₁, ..., aₙ are real numbers. 

40. Prove that the set of all sequences of real numbers has cardinality c. 

41. Prove that the union of c sets that each have cardinality c has cardinality c. 


Chapter 11 
Fundamentals of Euclidean Plane Geometry 


In this chapter we describe the fundamentals of Euclidean geometry of the plane. 
Our approach relies, to some extent, on some intuitively obvious properties of 
geometric figures that are apparent from looking at diagrams. More rigorous 
axiomatic treatments of Euclidean geometry are possible. 


11.1 Triangles 


We begin with a few basic concepts. By a line in the plane, we mean a straight line 
that extends infinitely in two directions; by a line segment, we mean the finite part of 
a line between two given points. We assume as an axiom that, given any two points 
in the plane, there exists a unique line passing through the two points. 

Another basic concept is that of a triangle, by which we mean a geometric figure 
consisting of three points (called its vertices) that do not all lie on one line and of 
the line segments joining those points (which are called the sides of the triangle). A 
typical triangle is pictured in Figure 11.1, where its vertices are labeled with capital 
letters. We often refer to the sides of the triangle using notation such as AB (or BA), 
BC, and AC. 


Fig. 11.1 A typical triangle 
The triangle in Figure 11.1 might be denoted △ABC. The angle at the top of the 
triangle might be denoted either ∠A or ∠BAC. When we say that two line segments 
are equal we mean that they have the same length. When we say that two angles are 
equal we mean that they have the same measure (i.e., that one could be placed on 
top of the other so that they coincide). 


Definition 11.1.1. Two triangles are congruent, denoted ≅, if their vertices can be 
paired so that the corresponding angles and sides are equal to each other. That is, 
△ABC ≅ △DEF if ∠A = ∠D, ∠B = ∠E, and ∠C = ∠F and AB = DE, 
BC = EF, and AC = DF. 


If two triangles are congruent, then one can be placed on top of the other so 
that they completely coincide. More generally, two geometric figures are said to 
be congruent to each other if they can be so placed. It is important to note that 
congruence of triangles can be established without verifying that all of the pairs 
of corresponding angles and all the pairs of corresponding sides are equal to each 
other; as we shall see, equality of certain collections of those pairs implies equality 
of all of them. 

For example, suppose that we fix an angle of a triangle and the lengths of the two 
sides that form the angle. That is, for example, suppose that, in Figure 11.2, we fix 
the angle B and lengths AB and BC. It seems intuitively clear that any two triangles 
with the specified sides AB and BC and the angle B between them are congruent to 
each other; the only way to complete the given data to form a triangle is by joining 
A to C by a line segment. Thus, it appears that any two triangles that have two pairs 
of equal sides and have equal angles formed by those sides are congruent to each 
other. We assume this as an axiom. 




Fig. 11.2 Illustrating side-angle-side 


The Congruence Axiom 11.1.2 (Side-Angle-Side). If two triangles have two 
pairs of corresponding sides equal and also have equal angles between those two 
sides, then the triangles are congruent to each other. 


We speak of this axiom as stating that triangles are congruent if they have 
“side-angle-side” in common. 


Definition 11.1.3. A triangle is said to be isosceles if two of its sides have the same 
length. The angles opposite the equal sides of an isosceles triangle are called the 
base angles of the triangle. 
Theorem 11.1.4. The base angles of an isosceles triangle are equal. 


Proof. Let the given triangle be △ABC with AB = AC. Turn the triangle 
over and denote the corresponding triangle as △A′C′B′, as shown in Figure 11.3. 
Then △ABC ≅ △A′C′B′. To see this, note that ∠A = ∠A′ and AB = 
A′C′ = AC = A′B′. Thus, the triangles have side-angle-side in common, 
and are therefore congruent to each other (11.1.2). In this congruence, ∠B 
corresponds to ∠C′, so ∠B = ∠C′. On the other hand, ∠C′ was obtained by 
turning ∠C over, and so ∠C′ = ∠C. It follows that ∠B = ∠C, as was to be 
proven. □ 




Fig. 11.3 Proving that the base angles of an isosceles triangle are equal 


Definition 11.1.5. A triangle is equilateral if all three of its sides have the same 
length. 


Corollary 11.1.6. All three angles of an equilateral triangle are equal to each 
other. 


Proof. Any two angles of an equilateral triangle are the base angles of an isosceles 
triangle and are therefore equal to each other by the previous theorem. It follows 
that all three angles are equal. □ 


It is sometimes convenient to establish congruence of triangles by 
correspondences other than side-angle-side. 


Theorem 11.1.7 (Angle-Side-Angle). If two triangles have “angle-side-angle” in 
common, then they are congruent. 


Proof. Suppose that triangles ABC and DEF are given with ∠A = ∠D, AB = 
DE, and ∠B = ∠E. If also AC = DF, then the triangles are congruent by 
side-angle-side (11.1.2). We show that this is the case. If AC is not equal to DF, then 
one of them is shorter; suppose, without loss of generality, that AC is shorter than 
DF. We will show that this is impossible. 


Fig. 11.4 Proving angle-side-angle 


Mark the length DF along AC beginning with the point A and ending at a point 
P, as shown in Figure 11.4. Then draw the line connecting B to P. It would follow 
that △ABP has side-angle-side in common with △DEF, so those triangles would 
be congruent (11.1.2). This would imply that ∠ABP = ∠E. But the hypothesis 
includes the fact that ∠ABC = ∠E. This would give ∠ABC = ∠ABP, from which 
we conclude that ∠PBC = 0. Therefore, PB lies on BC and hence AP = AC. 
Thus, AC = DF and the theorem is established. □ 


If two triangles have equal sides, then they automatically also have equal angles. 


Theorem 11.1.8 (Side-Side-Side). If two triangles have corresponding sides equal 
to each other, then they are congruent. 


Proof. Let triangles ABC and DEF be given with AB = DE, BC = EF, and 
AC = DF. At least one of the sides is greater than or equal to each of the other 
two; suppose, for example, that AB is greater than or equal to each of AC and CB 
(the other cases would be proven in exactly the same way). Then place the triangle 
DEF under △ABC so that DE coincides with AB as in Figure 11.5. Connect the 
points C and F by a straight line. Since AC = DF, triangle AFC is isosceles 
and the base angles ACF and AFC are equal to each other (by Theorem 11.1.4). 
Similarly, △BCF is isosceles, so ∠BCF = ∠BFC. Adding the angles shows that 
∠ACB = ∠DFE. It follows that triangles ABC and DEF agree in side-angle-side 
and are therefore congruent to each other (11.1.2). □ 


Fig. 11.5 Proving side-side-side 


Definition 11.1.9. A straight angle is an angle that is a straight line. That is, the 
angle ABC is a straight angle with vertex B if the points A, B, and C all lie on a 
straight line and B is in between A and C. A right angle is an angle that is half the 
size of a straight angle. 
Definition 11.1.10. Vertical angles are pairs of angles that occur opposite each 
other when two lines intersect. In Figure 11.6, the angles BEA and CED are a 
pair of vertical angles, and the angles BED and CEA are another pair of vertical 
angles. 


Fig. 11.6 Illustrating vertical angles 


Theorem 11.1.11. Vertical angles are equal. 


Proof. In Figure 11.6, we show that ∠BEA = ∠CED, as follows. Angle BEA and 
angle AEC add up to a straight angle. Angle AEC and angle CED also add up to a 
straight angle. Therefore, ∠BEA + ∠AEC = ∠AEC + ∠CED. Hence, angle BEA 
equals angle CED. □ 


One customary way of denoting the size of angles is in terms of degrees. 


Definition 11.1.12. The measure of an angle in degrees is defined so that a straight 
angle is 180° and other angles are the number of degrees determined by the 
proportion that they are of straight angles. 

In particular, a right angle is 90°. More generally, if x is the proportion that 
an angle is of a straight angle, then the degree measure of the angle is given 
by 180x. 


We will prove that the sum of the angles of a triangle is a straight angle. In the 
approach that we follow, the following partial result is essential. 


Theorem 11.1.13. The sum of any two angles of a triangle is less than 180°. 


Proof. Consider an arbitrary triangle ABC as depicted in Figure 11.7 and extend 
the side AB beyond A to a point F, as shown. We will prove that 
the sum of angles CAB and ACB is less than a straight angle. 

Let M be the midpoint of the side AC. Draw the line from B through M 
and extend it to the other side of M to a point D such that DM = MB. 
Draw the line from D to A. Then ∠DMA = ∠CMB, since they are a 
pair of vertical angles (Theorem 11.1.11). By construction, AM = MC and 
DM = MB. Thus, △CMB ≅ △AMD by side-angle-side (11.1.2). It follows 
that ∠DAM is equal to ∠BCM. Therefore, the sum that we are interested in, 
∠BCM + ∠MAB, is equal to the sum ∠DAM + ∠MAB. But this latter sum 
is less than a straight angle, since it together with ∠DAF sums to a straight 
angle. □ 


Fig. 11.7 The sum of two angles of a triangle is less than 180 degrees 


11.2 The Parallel Postulate 


Definition 11.2.1. Two lines in the plane are parallel if they do not intersect. 


For hundreds of years, mathematicians tried to prove the “Parallel Postulate” as a 
theorem that followed from the other basic assumptions about Euclidean geometry. 
Finally, in the 1800s, this was shown to be impossible when different geometries 
were constructed that satisfied the other basic assumptions but not the following 
(such geometries are now called “non-Euclidean geometries”). Since it cannot be 
proven, we assume it as an axiom. 


The Parallel Postulate 11.2.2. Given a line and a point that is not on the line, there 
is one and only one line through the given point that is parallel to the given line. 


We will develop necessary and sufficient conditions for two lines to be parallel. 
Given two lines, a third line that intersects both of the first two is said to be a 
transversal of the two lines. Given a transversal of two lines, a pair of angles created 
by the intersections of the transversal with the lines are said to be corresponding 
angles if they lie on the same sides of the given lines. In Figure 11.8, T is a 
transversal of the lines L₁ and L₂. The angles b and d are a pair of corresponding 
angles. The four angles between the two lines are called interior angles. If two 
interior angles lie on opposite sides of the transversal, they are called alternate 
interior angles. In Figure 11.8, the angles b and e are a pair of alternate interior 
angles, as are the angles c and f. 




Fig. 11.8 Corresponding angles and alternate interior angles 
Theorem 11.2.3. If the angles in a pair of corresponding angles created by a 
transversal of two lines are equal to each other, then the two lines are parallel to 
each other. 


Proof. If the theorem were not true, then there would be a situation as depicted in 
Figure 11.9, where ∠a = ∠c and lines L₁ and L₂ intersect in some point P. Now 
∠a + ∠b is clearly a straight angle. Then, since ∠a = ∠c, it would follow that 
the sum of angles b and c is a straight angle, contradicting Theorem 11.1.13. Hence, 
the lines L₁ and L₂ cannot intersect. □ 


Fig. 11.9 Equal corresponding angles imply that lines are parallel 


The converse of this theorem is also true. 


Theorem 11.2.4. If two lines are parallel, then any pair of corresponding angles 
are equal to each other. 


Proof. Suppose that two lines are parallel and that two corresponding angles 
differ from each other. Then there would be a situation, such as that depicted in 
Figure 11.8, with two parallel lines L₁ and L₂ and angle b different from angle d. 




Fig. 11.10 If lines are parallel, corresponding angles are equal 


Suppose that angle b is bigger than angle d (the proof where this inequality is 
reversed would be virtually identical). Then, as depicted in Figure 11.10, 
we could draw a line L through Q, the point of intersection of L₁ and T, such 
that angle b′ is equal to angle d. But then, by the previous theorem (11.2.3), L 
would be parallel to L₂. Thus, L and L₁ would be distinct lines through the point 
Q which are both parallel to L₂, contradicting the uniqueness aspect of the Parallel 
Postulate (11.2.2). □ 


Corollary 11.2.5. If two lines are parallel, then any pair of alternate interior 
angles are equal to each other. 


Proof. Consider parallel lines L₁ and L₂ and alternate interior angles b and e as 
pictured in Figure 11.8. From Theorem 11.2.4, we know that angles b and d are 
equal, and, by Theorem 11.1.11, angle d is equal to angle e. Therefore, angles b and 
e are equal. □ 


We can now establish the fundamental theorem concerning the angles of a 
triangle. 


Theorem 11.2.6. The sum of the angles of a triangle is a straight angle. 


Proof. Let a triangle ABC be given. Use the Parallel Postulate (11.2.2) to pass a 
line through A that is parallel to BC and mark points D and E on opposite sides 
of A on that line, as in Figure 11.11. By Corollary 11.2.5, ∠DAB = ∠ABC and 
∠EAC = ∠ACB. Clearly, the sum of the angles DAB, BAC, and EAC is a straight 
angle. Therefore, the sum of the angles ABC, BAC, and ACB is also a straight 
angle. □ 




Fig. 11.11 The sum of the angles of a triangle is a straight angle 
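
As a numerical sanity check of Theorem 11.2.6 (it is in no way a substitute for the proof), the following Python sketch computes the three angles of a triangle from the coordinates of its vertices and adds them. The use of coordinates and the helper names angle_at and angle_sum are our own illustrative assumptions; the chapter itself does not use coordinates.

```python
import math

def angle_at(p, q, r):
    """Angle at vertex p, in degrees, between the segments p-q and p-r."""
    v = (q[0] - p[0], q[1] - p[1])
    w = (r[0] - p[0], r[1] - p[1])
    dot = v[0] * w[0] + v[1] * w[1]
    cross = v[0] * w[1] - v[1] * w[0]
    # atan2 of (|cross|, dot) gives the unsigned angle between v and w.
    return math.degrees(math.atan2(abs(cross), dot))

def angle_sum(a, b, c):
    """Sum of the three angles of the triangle with vertices a, b, c."""
    return angle_at(a, b, c) + angle_at(b, a, c) + angle_at(c, a, b)

# Up to floating-point rounding, the printed value should be 180.
print(angle_sum((0, 0), (4, 0), (1, 3)))
```

Trying other non-collinear triples of points should give the same sum, up to rounding.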


The following is an obvious corollary. 


Corollary 11.2.7. If two angles of one triangle are respectively equal to two angles 
of another triangle, then the third angles of the triangles are also equal. 


Corollary 11.2.8 (Angle-Angle-Side). If two triangles agree in angle-angle-side, 
then they are congruent. 


Proof. By the previous corollary (11.2.7), the triangles have their third angles equal 
as well. Thus, the triangles also agree in angle-side-angle and, by Theorem 11.1.7, 
they are congruent. □ 


We finish this section with a useful fact that we will need later in the chapter. 
Definition 11.2.9. Lines, or line segments, are said to be perpendicular (or 
orthogonal) if they intersect in a right angle. 


Lemma 11.2.10. If two lines are parallel and two other lines are perpendicular to 
the parallel lines, then the lengths of the perpendicular line segments between the 
parallel lines are equal to each other. 


Proof. In Figure 11.12, we are assuming that L₁ is parallel to L₂ and that AB 
and CD are perpendicular to both of L₁ and L₂. (By Theorem 11.2.4, if a line 
is perpendicular to one of two parallel lines, it is perpendicular to the other as well.) 
We must prove that AB = CD. Note that ∠ACB = ∠DBC and ∠ABC = ∠BCD, 
by Corollary 11.2.5 (the second equality uses the fact that AB and CD are parallel 
to each other, which follows from Theorem 11.2.3). Thus, the triangles ABC and 
BCD are congruent by angle-side-angle (11.1.7), since they also share the side BC. 
Therefore, the corresponding sides AB and CD are equal to each other. □ 


Fig. 11.12 Perpendiculars between parallel lines are equal 


11.3 Areas and Similarity 


We require knowledge of the areas of some common geometric figures. 

Recall that a rectangle is a four-sided figure in the plane all of whose angles are 
right angles. By Theorem 11.2.3, this means that the opposite sides of a rectangle 
are parallel. Lemma 11.2.10 then implies that the opposite sides of a rectangle must 
have the same length. 

The following definition forms the basis for the definition of areas of all 
geometric figures. 


Definition 11.3.1. The area of a rectangle is defined to be the product of the lengths 
of two of its adjacent sides. 


The areas of other geometric figures can be obtained either by directly comparing 
them to rectangles or by approximating them by rectangles. 


Definition 11.3.2. A right triangle is a triangle one of whose angles is a right angle. 
The side opposite the right angle in a right triangle is called the hypotenuse of the 
triangle, and the other two sides are called the legs. 


Theorem 11.3.3. The area of a right triangle is one-half the product of the lengths 
of the legs of the triangle. 
Proof. Let the right triangle △ABC be as pictured in Figure 11.13, where ∠C is a 
right angle. By creating perpendiculars to AC at A and to BC at B, complete the 
triangle to a rectangle as shown. We will prove that triangles ACB and BDA are 
congruent and thus have equal areas, which must be half of the area of the rectangle. 




Fig. 11.13 Area of a right triangle 


Since the sum of the angles of a triangle is 180° (Theorem 11.2.6), the sum of 
angles BAC and ABC is 90°. Since AD is perpendicular to AC, the sum of the 
angles BAC and BAD is also 90°. Hence, ∠ABC = ∠BAD. Similarly, since BD 
is perpendicular to BC, ∠CAB = ∠ABD. It follows that △ABC ≅ △BAD, since 
they agree in angle-side-angle (11.1.7). Thus, those triangles have equal areas. Since 
their areas sum to the area of the rectangle whose area is the product of the legs of 
the triangle ABC, it follows that the area of the triangle ABC is one-half of that 
product. □ 


Definition 11.3.4. Any one of the sides of a triangle may be regarded as a base of 
the triangle. If a side of a triangle is designated as its base, then the height of the 
triangle (relative to that base) is the length of the perpendicular from the base to 
the vertex of the triangle not on the base. It may be necessary to extend the base 
of the triangle in order to determine its height, as in the second triangle pictured in 
Figure 11.14. (In both of the triangles depicted in Figure 11.14, h is the height of the 
triangle to the base AC.) 


Theorem 11.3.5. The area of any triangle is one-half the product of the length of a 
base of the triangle and the height of the triangle to that base. 




Fig. 11.14 Heights of triangles 
Proof. Suppose that the triangle ABC is as pictured in the first triangle in 
Figure 11.14, where h is the height to the base AC. Then, by the previous 
theorem (11.3.3), the area of the right triangle ABD is one-half the product of h and 
AD, and the area of the right triangle DBC is one-half the product of h and DC. 
The area of triangle ABC is the sum of those areas and is therefore ½h·(AD) + 
½h·(DC) = ½h·(AD + DC) = ½h·(AC). This finishes the proof in this case. 

Suppose that △ABC is as pictured in the second triangle in Figure 11.14. 
The side AC had to be extended to the point D at the bottom of the height. 
In this case, the area of △ABC is the difference between the area of the 
right triangle BDC and the area of the right triangle BDA. Hence, the area is 
½h·(DC) − ½h·(DA) = ½h·(AC). □ 
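
The formula of Theorem 11.3.5 can be checked numerically. The sketch below is only an illustration under assumptions of our own: it places the triangle in coordinates, finds the foot D of the perpendicular from B to the line AC, and compares one-half of base times height with the "shoelace" coordinate formula for area, which is not part of this chapter and serves only as an independent check. The function names are hypothetical.

```python
import math

def height_to_base(a, c, b):
    """Length of the perpendicular from b to the line through a and c."""
    ux, uy = c[0] - a[0], c[1] - a[1]
    # t locates the foot of the perpendicular along the line from a to c;
    # t outside [0, 1] means the base must be extended (second case of the proof).
    t = ((b[0] - a[0]) * ux + (b[1] - a[1]) * uy) / (ux * ux + uy * uy)
    foot = (a[0] + t * ux, a[1] + t * uy)  # the point D of Figure 11.14
    return math.dist(b, foot)

def area_half_base_height(a, b, c):
    """One-half the length of the base AC times the height to that base."""
    return 0.5 * math.dist(a, c) * height_to_base(a, c, b)

def area_shoelace(a, b, c):
    """Area from coordinates (shoelace formula), used as an independent check."""
    return 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

for tri in [((0, 0), (1, 3), (4, 0)),    # D lies between A and C
            ((0, 0), (5, 4), (2, 0))]:   # the base AC must be extended to reach D
    print(area_half_base_height(*tri), area_shoelace(*tri))
```

The two printed numbers in each row should agree up to rounding, in both of the cases treated in the proof.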


One of the most famous theorems in mathematics is the Pythagorean Theorem. 
There are very many known proofs of this theorem. One of the nicest, in our view, 
is the one presented below. 


The Pythagorean Theorem 11.3.6. For any right triangle, the square of the length 
of the hypotenuse is equal to the sum of the squares of the lengths of the legs. 


Proof. Let the right triangle have legs of lengths a and b and hypotenuse of length c. 
Place four copies of the given right triangle inside a square whose sides have length 
a + b, as shown in Figure 11.15. We need to prove that the four-sided figure DEFG 
is a square; i.e., since each of its sides has length c, we must prove that each of its 
angles is a right angle. 




Fig. 11.15 Proof of the Pythagorean Theorem 


Since the sum of the angles of a triangle is 180° (Theorem 11.2.6), the sum of 
the two non-right angles in a right triangle is 90°. Since each angle of DEFG sums 
with the two non-right angles of the triangle to a straight angle, it follows that each 
angle of DEFG is 90°. Thus, DEFG is a square, each of whose sides has length 
c. The area of the big square, each of whose sides has length a + b, is the sum of 
the area of the square DEFG and of the areas of four copies of the original right 
triangle. That is, (a + b)² = 4(½ab) + c². Thus, a² + 2ab + b² = 2ab + c², or 
a² + b² = c². □ 
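
The area bookkeeping in this proof can be replayed with exact arithmetic. The following Python sketch is an illustration only; the function name hypotenuse_squared is our own. It subtracts the areas of the four triangles from the area of the big square and confirms that what remains, the area c² of the inner square, equals a² + b².

```python
from fractions import Fraction

def hypotenuse_squared(a, b):
    """Solve (a + b)^2 = 4 * (ab / 2) + c^2 for c^2, as in the proof."""
    big_square = (a + b) ** 2
    four_triangles = 4 * Fraction(a * b, 2)  # four right triangles with legs a and b
    return big_square - four_triangles

for a, b in [(3, 4), (5, 12), (1, 1)]:
    # The two printed values in each row should be equal.
    print(hypotenuse_squared(a, b), a * a + b * b)
```

For the legs 3 and 4 the inner square has area 25, which matches the familiar right triangle with sides 3, 4, and 5.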


Definition 11.3.7. Two triangles are similar if their vertices can be paired so that 
their corresponding angles are equal to each other. We use the notation △ABC ∼ 
△DEF to denote similarity. 


Of course (by Corollary 11.2.7), it follows that two triangles are similar if 
they agree in two of their angles. It is an important, and nontrivial, fact that the 
corresponding sides of similar triangles are proportional to each other. In other 
words, if △ABC ∼ △DEF, then AB/DE = AC/DF = BC/EF. The ingenious proof that we 
present goes back to Euclid. Our basic approach is based on the following lemma. 


Lemma 11.3.8. If a triangle with area S₁ has the same height with respect to a 
base b₁ that a triangle with area S₂ has with respect to a base b₂, then S₁/b₁ = S₂/b₂. 


Proof. Let the common height of the two triangles with respect to the given bases 
be h. Then, S₁ = ½hb₁ and S₂ = ½hb₂. It follows that S₁/b₁ = ½h = S₂/b₂. □ 


Theorem 11.3.9. If two triangles are similar, then their corresponding sides are 
proportional. That is, if △ABC ∼ △DEF, then AB/DE = AC/DF = BC/EF. 


Proof. It suffices to prove that AB/DE = AC/DF; the other equation can be obtained as in 
the proof below but placing the triangles so that the angle at B coincides with the 
angle at E. 

Place the triangles so that the angle of the first triangle at A coincides with the 
angle of the second triangle at D. If the length of AB is the same as the length of 
DE, then the two triangles are congruent and all the proportions are 1. Assume, 
then, that the length of AB is less than the length of DE. (If the opposite is true, 
the proof below can be accomplished by interchanging the roles of △ABC and 
△DEF.) The situation is depicted in Figure 11.16. 


Fig. 11.16 Corresponding sides of similar triangles are proportional 


We need to construct triangles to which we can apply the preceding lemma. 
In Figure 11.16, connect B and F by a line and C and E by a line. Note that, 
by Theorem 11.2.3, ∠ABC = ∠DEF implies that the line BC is parallel to 
the line EF. Regard the triangles BEC and BFC as having a common base 
BC. Then the corresponding heights of the triangles are the perpendiculars from 
E to (the extension of) BC and from F to (the extension of) BC, respectively. 
By Lemma 11.2.10, those heights are equal to each other. Thus, triangles BEC and 
BFC, having equal bases and heights, have equal areas. Adding those triangles to 
△ABC establishes that triangles ACE and ABF have equal areas. 

We can now use Lemma 11.3.8, as follows. Since △ABC has the same 
height with respect to its base AB as △ACE has with respect to its base DE, 
Lemma 11.3.8 implies that 

area(△ABC)/AB = area(△ACE)/DE, or AB/DE = area(△ABC)/area(△ACE). 

Similarly, △ABC has the same height with respect to its base AC as △ABF has 
with respect to its base DF, so 

area(△ABC)/AC = area(△ABF)/DF, or AC/DF = area(△ABC)/area(△ABF). 

Since triangles ABF and ACE have the same area, it follows that AB/DE = AC/DF. □ 
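
To see Theorem 11.3.9 at work on numbers, the sketch below builds two triangles that have the same angles at their first two vertices but different bases, and then prints the three ratios of corresponding sides. It is an illustration under assumptions of our own: the triangles are placed in coordinates, both prescribed angles are taken to be acute, and the function names are hypothetical.

```python
import math

def triangle_from_angles(angle_a, angle_b, base):
    """Vertices A, B, C with AB = base on the x-axis and the given angles
    (in radians, both assumed acute) at A and B."""
    ta, tb = math.tan(angle_a), math.tan(angle_b)
    # The ray from A is y = x * ta; the ray from B is y = (base - x) * tb.
    x = base * tb / (ta + tb)
    return (0.0, 0.0), (base, 0.0), (x, x * ta)

def side_ratios(t1, t2):
    """The ratios AB/DE, AC/DF, BC/EF for triangles t1 = ABC and t2 = DEF."""
    (a, b, c), (d, e, f) = t1, t2
    return (math.dist(a, b) / math.dist(d, e),
            math.dist(a, c) / math.dist(d, f),
            math.dist(b, c) / math.dist(e, f))

small = triangle_from_angles(math.radians(50), math.radians(70), 3.0)
large = triangle_from_angles(math.radians(50), math.radians(70), 7.5)
# All three ratios should agree (here with 3.0 / 7.5), up to rounding.
print(side_ratios(small, large))
```

By Corollary 11.2.7 the third angles of the two triangles are also equal, so the triangles are similar in the sense of Definition 11.3.7.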


Definition 11.3.10. An angle inscribed in a circle is an angle whose sides each 
connect two points of the circle and whose vertex is a point on the circle. (In 
Figure 11.17, the angle BAC is inscribed in the circle.) The part of the circle that is 
opposite the angle and lies between the sides of the angle is called the arc cut off by 
the angle (or the arc intercepted by the angle). 


Fig. 11.17 An inscribed angle 


We will need the following result in Chapter 12. 


Theorem 11.3.11. If an angle is inscribed in a circle and the arc that it cuts off is 
a semicircle, then the angle is a right angle. 


Proof. The angles that we are considering are those angles such as BAC in 
Figure 11.18, where BC is the diameter of the circle and O is the center of the 
circle. Draw the diameter from A through O. 
Since OA and OB are radii of the given circle, and thus are equal, it follows that 
∠OAB = ∠OBA (Theorem 11.1.4). Similarly, ∠OAC = ∠OCA. Thus, we can 
label the angles x, y, z, w as indicated in Figure 11.18. 




Fig. 11.18 An inscribed angle that cuts off a semicircle is a right angle 


Since 2x + w = 180° and z + w = 180°, it follows that z = 2x. Similarly, 
w = 2y. Therefore, 2x + 2y = z + w = 180°, so x + y = 90°. This shows that 
∠BAC is 90°. □ 
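
A quick numerical illustration of Theorem 11.3.11 follows. The setup is an assumption of ours rather than part of the text: the circle is centered at the origin, BC is taken to be the horizontal diameter, and the point A is described by an angle theta around the circle. The function name is hypothetical.

```python
import math

def inscribed_angle_degrees(theta, radius=1.0):
    """Angle BAC in degrees, where BC is the horizontal diameter and A is the
    point of the circle at polar angle theta (theta not a multiple of pi)."""
    b = (-radius, 0.0)
    c = (radius, 0.0)
    a = (radius * math.cos(theta), radius * math.sin(theta))
    v = (b[0] - a[0], b[1] - a[1])
    w = (c[0] - a[0], c[1] - a[1])
    dot = v[0] * w[0] + v[1] * w[1]
    cross = v[0] * w[1] - v[1] * w[0]
    return math.degrees(math.atan2(abs(cross), dot))

for theta in (0.3, 1.0, 2.5, -1.2):
    # Each printed value should be 90, up to rounding.
    print(inscribed_angle_degrees(theta))
```

Wherever A is placed on the circle (other than at B or C), the computed angle stays at a right angle, as the theorem asserts.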


11.4 Problems 
Basic Exercises 


1. Which of the following triples cannot be the lengths of the sides of a right 
triangle? 

(a) 3, 4, 5 
(b) 1, 1, 1 
(c) 2, 3, 4 
(d) 1, √3, 2 


2. In the diagram given below, lines L₁ and L₂ are parallel and line T is a 
transversal. If the measure of ∠d is 55° and the measure of ∠f is 130°, find 
the measures of ∠b, ∠e, and ∠c. 




3. In the diagram given below, the line segment BD is perpendicular to the line 
segment AC, the length of AM is equal to the length of MC, the measure of 
∠C is 35°, and the measure of ∠FAD is 111°. 




(a) Prove that triangle ABM is congruent to triangle CBM. 
(b) Find the measure of ∠CAB. 
(c) Find the measure of ∠ABC. 
(d) Find the measure of ∠ABD. 
(e) Find the measure of ∠AMD. 
(f) Find the measure of ∠D. 
(g) Show that the line segments AD and BC are not parallel. 


4. Prove that two right triangles are congruent if a leg of one of the triangles has 
the same length as one of the legs of the other triangle and the lengths of their 
hypotenuses are equal. 


Interesting Problems 


5. A quadrilateral is a four-sided figure in the plane each of whose interior angles 
is less than 180°. Prove that the sum of the angles of a quadrilateral is 360°. 

6. For quadrilateral ABCD, as shown below, suppose that ∠ABC = ∠CDA and 
∠DAB = ∠BCD. Prove that AB = CD and BC = AD. 


7. Prove that if two angles of a triangle are equal, then the sides opposite those 
angles are equal. 

8. With reference to the diagram below, prove that △ABC is an isosceles triangle 
if ∠DAB = ∠EAC and DE is parallel to BC. 

9. A parallelogram is a four-sided figure in the plane whose opposite sides are 
parallel to each other. Prove the following: 

(a) The opposite sides of a parallelogram have the same length. 
(b) The area of a parallelogram is the product of the length of any side and the 
length of a perpendicular to that side from a vertex not on that side. 
(c) If one of the angles of a parallelogram is a right angle, then the 
parallelogram is a rectangle. 

10. A trapezoid is a four-sided figure in the plane two of whose sides are parallel 
to each other. The height of a trapezoid is the length of a perpendicular 
from one of the parallel sides to the other. Prove that the area of a trapezoid 
is its height multiplied by the average of the lengths of the two parallel 
sides. 

11. A square is a four-sided figure in the plane all of whose sides are equal to each 
other and all of whose angles are right angles. The diagonals of the square 
are the lines joining opposite vertices. Prove that the diagonals of a square are 
perpendicular to each other. 

12. Show that lines are parallel if there is a transversal such that the alternate interior 
angles are equal to each other. 


Challenging Problems 


13. Give an example of two triangles that agree in “angle-side-side” but are not 
congruent to each other. 

14. Find the length of line segment OA in the diagram below (line segment OD is 
a radius of the circle centered at O, line segment AD has length 4, line segment 
AC has length 10, and OABC is a rectangle). 


15. Prove the converse of the Pythagorean Theorem; i.e., show that if the lengths of 
the sides of a triangle satisfy the equation a² + b² = c², then the triangle is a 
right triangle. 

16. The following problem establishes some basic results in trigonometry. 


(a) Let θ be any angle between 0 and 90°. Place θ in a right triangle, as shown 
in the diagram below, and label the sides as in the diagram. Define sin θ to 
be a/c, cos θ to be b/c, and tan θ to be a/b. Using Theorem 11.3.9, show that 
these definitions do not depend on which right triangle a given angle θ is 
placed in. 

[Diagram: a right triangle with hypotenuse c, leg a opposite θ, and leg b adjacent to θ.] 

(b) Label the angles of a triangle with A, B, and C and label the side opposite 
∠A with a, the side opposite ∠B with b, and the side opposite ∠C with c. 
Prove that, in the case where the angles A, B and C are all less than 90 
degrees: 

(i) a/sin A = b/sin B = c/sin C (The Law of Sines) 
(ii) c² = a² + b² − 2ab cos C (The Law of Cosines) 


(The functions sine, cosine and tangent can also be defined for angles 
greater than 90 degrees. The Law of Sines and the Law of Cosines hold 
for such angles as well.) 


17. (This problem generalizes the result of Theorem 11.3.11.) It is sometimes useful 
to have a measure of an arc of a circle. One common such measure is in terms of 
degrees. The measure of a full circle is defined to be 360 degrees. The number 
of degrees in any arc of a circle is defined to be the product of 360 and the length 
of that arc divided by the circumference of the circle. 

Prove that the measure of an angle inscribed in a circle is one-half the 
measure of the arc cut off by the angle. That is, in the diagram below, the number 
of degrees of ∠BAC is half the number of degrees in the arc BC. 

[Hint: One approach is to first prove the special case where AC is a diameter of 
the circle. ] 


Chapter 12 
Constructibility 


The Ancient Greeks were interested in many different kinds of mathematical 
problems. One of the aspects of geometry that they investigated is the question of 
which geometric figures can be constructed using a compass and a straightedge. 
A compass is an instrument for drawing circles. The compass has two branches that 
open up like a scissors. One of the branches has a sharp point at the end and the 
other branch has a pen or pencil at the end. If the compass is opened so that the 
distance between the two ends is r and the pointed end is placed on a piece of paper 
and the compass is rotated about that point, the writing end traces out a circle of 
radius r. The drawing made by any real compass will only approximate a circle of 
radius r. But we are going to consider constructions theoretically; we will assume 
that a compass opened up a distance r precisely makes a circle of radius r. 

To do geometric constructions, we will also require (as the Ancient Greeks 
did) another implement. By a straightedge we mean a device for drawing lines 
connecting two points and extending such lines as far as desired in either direction. 
Sometimes people inaccurately speak of constructions with “ruler and compass.” 
It is important to understand that the constructions investigated by the Ancient 
Greeks do not allow use of a ruler in the sense of an instrument that has distances 
marked on it. We can only use such an instrument to connect pairs of points by 
straight lines; we cannot use it to measure distances. 

In this chapter, when we say “construct” or “construction,” we always mean 
“using only a compass and a straightedge.” 

We begin by showing how to do some basic constructions. But the most 
interesting part of this chapter will be proving that certain geometric objects cannot 
be constructed using a straightedge and compass. In particular, we will prove that 
an angle of 20° cannot be constructed. This implies that an angle of 60° cannot 
be trisected (i.e., divided into three equal parts) with a straightedge and compass. 
The Ancient Greeks assumed that there must be some way of trisecting every angle; 
they thought that they had simply not been clever enough to find a method for doing 
so. It was only after mathematical advances in the nineteenth century that it could
be proven that there is no way to trisect an angle of 60° with a straightedge and 
compass. The highlight of this chapter will be a proof of that fact. Although it is 
hard to imagine how something like that could be proved, we shall see that there is 
an indirect approach that also establishes many other interesting results. 


12.1 Constructions with Straightedge and Compass 


Let’s start with some very basic constructions. 


Definition 12.1.1. A perpendicular bisector of a line segment is a line that is 
perpendicular to the line segment and goes through the middle of the line segment. 


Theorem 12.1.2. Given any line segment, its perpendicular bisector can be con- 
structed. 


Proof. Given a line segment AB, as shown in Figure 12.1, put the point of the
compass at A and open the compass to a radius equal to the length of AB; call this
length r. Then draw the circle with center at A and radius r. Similarly, draw
the circle with center at B and radius r. The two circles will intersect at two points;
label them C and D as indicated in Figure 12.1. Take the straightedge and draw the
line segment from C to D. We claim that CD is a perpendicular bisector of AB.


Fig. 12.1 Constructing the perpendicular bisector of a line segment


To prove this, label the point of intersection of CD and AB as E and then draw
the line segments AC, CB, BD, and DA. We must prove that AE = EB and that
∠CEA (and/or any of the other three angles at E) is a right angle. First note that
AC, CB, BD, and DA all have the same length, r, since they are all radii of the
two circles of radius r. Thus, triangle ACD is congruent to triangle BCD, since the
third side of each is CD and they therefore agree in side-side-side (11.1.8). It follows
that ∠ACE = ∠BCE. Hence, triangle ACE is congruent to triangle BCE by side-
angle-side (11.1.2). Therefore, AE = EB. Moreover, ∠AEC = ∠BEC, so, since
those two angles sum to a straight angle, each of them is a right angle. □
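
The conclusion of this proof can be checked numerically in coordinates. The following Python sketch is only an illustration and is not part of the construction: the coordinates chosen for A and B are arbitrary, and the helper names `equal_circle_intersections` and `line_intersection` are hypothetical. It computes C and D as the two intersection points of the circles of radius r, finds the point E where CD meets AB, and confirms, up to floating-point rounding, that AE = EB and that CD is perpendicular to AB.

```python
import math

def equal_circle_intersections(p, q, r):
    """Two intersection points of the circles of radius r centered at p and q."""
    d = math.dist(p, q)                           # distance between the centers
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    h = math.sqrt(r * r - (d / 2) ** 2)           # half the length of the common chord
    ux, uy = (q[0] - p[0]) / d, (q[1] - p[1]) / d # unit vector from p to q
    return (mx - h * uy, my + h * ux), (mx + h * uy, my - h * ux)

def line_intersection(p1, p2, p3, p4):
    """Point where line p1p2 meets line p3p4 (the lines are assumed not parallel)."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / den
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

A, B = (0.0, 0.0), (4.0, 2.0)      # an arbitrary segment, chosen for illustration
r = math.dist(A, B)                # compass opened to the length of AB
C, D = equal_circle_intersections(A, B, r)
E = line_intersection(C, D, A, B)  # where CD crosses AB

# Dot product of the directions of CD and AB; zero means perpendicular.
dot = (D[0] - C[0]) * (B[0] - A[0]) + (D[1] - C[1]) * (B[1] - A[1])

print(math.isclose(math.dist(A, E), math.dist(E, B)))  # AE = EB
print(abs(dot) < 1e-9)                                 # CD is perpendicular to AB
```

A floating-point check of this kind illustrates the theorem for one segment; it does not replace the congruence argument above.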


Definition 12.1.3. A bisector of an angle is a line from its vertex that divides the 
angle into two equal subangles. 
Theorem 12.1.4. Given any angle, its bisector can be constructed.


Fig. 12.2 Constructing the bisector of an angle 


Proof. Consider an angle ABC, as pictured in Figure 12.2, and draw a circle
centered at B that intersects both BA and BC. Label the points of intersection of
the circle with AB and with BC as E and F, respectively. Let r be the distance
from E to F. Use the compass to draw a circle of radius r centered at E and a circle
of radius r centered at F. These two circles intersect in some point G within the
angle ABC, as shown in Figure 12.2. Use the straightedge to draw the line segment
connecting B to G. We claim that this line segment bisects the angle ABC.

To see this, draw the lines EG and FG. We prove that triangle BEG is congruent
to triangle BFG. Note that BE = BF, since they are both radii of the original circle
centered at B. Note also that EG = FG, since they are each radii of circles with
radius r. Since triangle BEG and triangle BFG share side BG, it follows from side-
side-side (11.1.8) that the two triangles are congruent. Hence, ∠EBG = ∠FBG,
and BG is a bisector of angle ABC. □


Theorem 12.1.5. Any given line segment can be copied using only a straightedge 
and compass. 


Fig. 12.3 Copying a line segment


Proof. Suppose a line segment AB is given, as pictured in Figure 12.3, and it is 
desired to copy it on another line. Choose any point C on the other line, and then 
open the compass to a radius the length of AB. Put the point of the compass at C 
and draw any portion of the resulting circle that intersects the other line. Label the 
point of intersection D. Then CD is a copy of AB. □
Theorem 12.1.6. Any given angle can be copied using only a straightedge and
compass. 


Fig. 12.4 Copying an angle 


Proof. Let an angle ABC be given, as in Figure 12.4. We construct an angle equal
to ∠ABC with vertex G on any other line. To do this, draw any arc of any circle
(of radius, say, r) centered at B that intersects both BA and BC. Label the points
of intersection D and E. Draw the circle of radius r centered at G. Use H to label
the point where that circle intersects the line containing G. Then adjust the compass
to be able to make circles of radius DE. Put the point of the compass at H and
draw a portion of the circle that intersects the circle centered at G; call that point of
intersection I. Draw line segments connecting D to E and I to H.

Then IH = DE, since IH is a radius of a circle with radius DE. Also draw the
line segment GI. The lengths of BD, BE, GI, and GH are all equal to r. It follows
by side-side-side (11.1.8) that triangle BDE is congruent to triangle GIH. Thus,
∠IGH is a copy of ∠ABC. □


It is sometimes necessary to erect a perpendicular at a given point on a line. 


Theorem 12.1.7. If P is a point on a line, then it is possible to construct a 
perpendicular to the line that passes through P. 


Proof. Construct a right angle (by, for example, constructing the perpendicular 
bisector of a line segment, as in Theorem 12.1.2). Then copy the right angle so 
that its vertex is at P and one side of the angle is the given line. The other side of 
the angle is then a perpendicular, as desired. □


Similarly, a perpendicular to a given line can be “dropped” from a point not on 
the line. 


Theorem 12.1.8. If P is a point that is not on a given line, then it is possible to
construct a line through P that is perpendicular to the given line. 


Proof. To drop the perpendicular, make a circle with center at the point P whose
radius is large enough that it intersects the line in two points, A and B, as depicted
in Figure 12.5. Draw the line segments connecting A to P and B to P. Next,
bisect the angle APB (Theorem 12.1.4). The resulting triangles, one on either
side of the angle bisector, are congruent to each other since they agree in
side-angle-side (11.1.2): PA = PB because both are radii of the circle, the two
angles at P are equal, and the part of the bisector between P and the line is a
side of both triangles. This implies that the two angles the angle bisector makes with
the original line are equal to each other, and, since they sum to a straight angle, they
are therefore each 90 degrees. Hence, the angle bisector is a perpendicular from the
point P to the given line. □


Fig. 12.5 Dropping a perpendicular from a point to a line


Theorem 12.1.9. If the angles α and β are constructed, then:


(i) the angle α + β can be constructed, and
(ii) for every natural number n, the angle nα can be constructed.


Fig. 12.6 Constructing the sum of two angles 


Proof. (i) Let the angles α and β be given, as pictured in Figure 12.6. To construct
the angle α + β, simply copy the angle α with one side DE and the other side above
the original angle β, as shown in the third diagram in Figure 12.6.

(ii) This clearly follows from repeated application of part (i), starting with angles
α and β that are equal to each other. (This can be proven more formally using
mathematical induction.) □


Theorem 12.1.10. Given any line segment and any natural number n, the line 
segment can be divided into n equal parts using only a straightedge and compass. 


Proof. Fix a natural number n. Let a line segment AB be given, as shown in
Figure 12.7. Use the straightedge to draw any line segment emanating from A that is
at a positive angle less than 90 degrees with AB, and pick a point C on it, as shown.
Open the compass to any radius s. Beginning at A, use the compass to mark off n
consecutive segments of AC of length s, as illustrated in Figure 12.7. (If the length
of AC is less than ns, it is necessary to extend AC to make it greater than ns.) Label
the points of intersection of the first n − 1 arcs and AC as $P_1, P_2, P_3, \ldots, P_{n-1}$.
Label the point of intersection of the line and the nth arc as D. Use a straightedge to
connect D to B. We will construct lines parallel to DB through each point $P_j$, after
which we will show that the intersections of those lines with AB divide AB into n
equal segments.


Fig. 12.7 Dividing a line segment into n equal parts


To construct the parallel lines, copy the angle ADB (Theorem 12.1.6) at each
point $P_j$ so that one side of the new angle lies on AD and the other side points
downwards and is extended to intersect the line AB. These are the dotted lines in
Figure 12.7; they are parallel by Theorem 11.2.3. Label the points of intersection of
the dotted lines with AB as $Q_1, Q_2, Q_3, \ldots, Q_{n-1}$, as shown.

We claim that the points $Q_1, Q_2, Q_3, \ldots, Q_{n-1}$ divide the segment AB into n
equal parts. To see this, note that, for each j, the triangle $AP_jQ_j$ has two angles,
$\angle P_jAQ_j$ and $\angle AP_jQ_j$, equal to corresponding angles of $\triangle ADB$. Thus, $\triangle AP_jQ_j$
is similar to $\triangle ADB$ (Corollary 11.2.7). Therefore, the corresponding sides are
proportional (Theorem 11.3.9). For each j, the ratio of $AP_j$ to AD is $\frac{j}{n}$. Thus,
the length of $AQ_j$ divided by the length of AB is also $\frac{j}{n}$. □
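
The construction in this proof can be followed in coordinates with exact rational arithmetic. The sketch below is an illustration under stated assumptions: A is placed at the origin, B on the horizontal axis, and the auxiliary line through A runs along a fixed rational vector `step` whose length plays the role of the compass radius s. The function name `divide_segment` and the parameter `step` are hypothetical. Each point $Q_j$ is found as the intersection of AB with the line through $P_j$ parallel to DB.

```python
from fractions import Fraction

def divide_segment(length, n, step=(Fraction(1), Fraction(2))):
    """x-coordinates of Q_1, ..., Q_{n-1} on AB, where A = (0, 0) and B = (length, 0)."""
    B = (Fraction(length), Fraction(0))
    D = (n * step[0], n * step[1])          # the n-th mark on the auxiliary line
    dx, dy = B[0] - D[0], B[1] - D[1]       # direction of the line DB
    points = []
    for j in range(1, n):
        px, py = j * step[0], j * step[1]   # the mark P_j
        t = -py / dy                        # parameter where the parallel through P_j meets y = 0
        points.append(px + t * dx)          # x-coordinate of Q_j
    return points

print(divide_segment(7, 5))
```

For a segment of length 7 and n = 5, the returned coordinates are 7/5, 14/5, 21/5 and 28/5, in agreement with the ratio $\frac{j}{n}$ obtained from the similar triangles.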


Therefore, a line segment can be divided into any number of equal parts using 
only a straightedge and compass. The situation is quite different with respect to 
angles. In particular, some angles, such as those of 60 degrees, cannot be divided 
into three equal parts using only a straightedge and compass. We now begin 
preparation for an indirect approach to establishing that fact. 


12.2 Constructible Numbers 


We now consider constructing numbers on a number line instead of constructing 
geometric objects. 

We begin by imagining a horizontal line on which a point is arbitrarily marked 
as 0 and another point, to the right of it, is arbitrarily marked as 1. We consider the 
question of what other numbers can be obtained by starting with the length 1 (that 
we take as the distance between the points marked 0 and 1) and doing geometric 
constructions in the plane to obtain other lengths. By a geometric construction, we 
mean using our straightedge to make lines joining any two points we have already 
marked (i.e., constructed) or using our compass to construct a circle centered at a
constructed point using a radius that has previously been constructed. 


Definition 12.2.1. A real number is constructible if the point corresponding to it 
on the number line can be obtained from the marked points 0 and 1 by performing
a finite sequence of geometric constructions in the plane using only a straightedge 
and compass. 


Theorem 12.2.2. Every integer is constructible. 


Proof. The numbers 0 and 1 are given as constructible. The number 2 can easily
be constructed: simply take a compass, open it up to radius 1 by placing one side
at the point 0 and the other side at the point 1, and then place the pointed side on
the point marked 1 and draw the circle of radius 1 with that point as center. The
point where that circle meets the number line to the right of 1 is the number 2, so
2 has been constructed. Then clearly 3 can be constructed by placing the compass
with radius 1 so as to make a circle centered at 2. Similarly, all the natural numbers
can be constructed. To construct the number −1, simply make the circle of radius 1
centered at 0 and mark the intersection to the left of 0 of that circle with the number
line. Then −2 can be constructed by marking the point where the circle centered at
−1 meets the number line to the left of the point −1. Every negative integer can be
constructed similarly. □


What about the rational numbers? 
Theorem 12.2.3. Every rational number is constructible. 


Proof. To construct, for example, the number $\frac{1}{3}$, simply divide the interval between
0 and 1 into three equal parts (see Theorem 12.1.10) and mark the right-most point
of the first part as $\frac{1}{3}$. Similarly, for any natural number n, dividing the unit interval
into n equal parts shows that $\frac{1}{n}$ is constructible. Then, for any natural number m,
$\frac{m}{n}$ can be constructed by copying m segments of length $\frac{1}{n}$ next to each other on the
number line, with the first of those segments beginning at 0.

We have therefore shown that all of the positive rational numbers are con- 
structible. If x is a negative rational number, construct |x| and then make a circle 
of radius |x| centered at 0; the point to the left of 0 where that circle intersects the 
number line is x. Thus, every rational number is constructible. □


We need to get information about the set of all constructible numbers. It is essen- 
tial to the development of this approach that doing arithmetic with constructible 
numbers produces constructible numbers. 


Theorem 12.2.4. If a is constructible, then −a is constructible.


Proof. If a = 0, then −a = 0 and there is nothing to construct. Otherwise, place a
compass on the number line with its point at 0 and the other end at a, so that it is
opened to radius |a|. Then draw the circle. The number −a will be the point of intersection
of the circle and the number line opposite to that of a. (If a is positive, then −a is
negative, but if a is negative, then −a is positive.) □
Theorem 12.2.5. The sum of two constructible numbers is constructible.


Proof. Suppose that a and b are constructible. If b = 0, then clearly a + b = a + 0 =
a is constructible. So assume that b ≠ 0. Open the compass to radius |b|, place the
point of the compass on the number line at a, and draw the circle. If b is positive,
then a + b will be the point of intersection of the circle and the number line to
the right of a. If b is negative, then a + b will be the point of intersection of the
circle and the number line to the left of a. In both cases, this proves that a + b is
constructible. □


We also need to construct products and quotients. These constructions are a little 
more complicated; we begin with the following. 


Theorem 12.2.6. If a and b are positive constructible numbers, then $\frac{a}{b}$ is
constructible.


Proof. We consider two possible cases, that where b is greater than 1 and that where
b is less than 1. (If b = 1, then $\frac{a}{b} = a$ and there is nothing to prove.)


Fig. 12.8 Constructing quotients (case b > 1) 


For the case where b is greater than 1, mark the numbers 0, 1, and b on the 
number line. Use the straightedge to draw a line segment of length greater than a 
starting from 0, making any angle greater than 0° and less than 90° with the number 
line, as pictured in Figure 12.8. Since a is constructible, we can open the compass 
to radius a. Place the point of the compass at 0 and mark a on the line above the 
number line. Use the straightedge to connect the point a on the new line to the point 
b on the number line. 

Let β be the angle at b between this line and the number line, and to the left of
this line, as shown. Copy the angle β to the point 1 on the number line so that the
lower side of the angle is the number line itself. Use the straightedge to extend the
other side of the angle beyond the new line. The intersection of the other side of the
angle and the new line is a point that we have thereby constructed. Let the distance
from the origin to that point be x. We can open the compass to radius x and thereby
mark x on the number line. So x is a constructible number. The relationship between
x and a and b can be determined by observing that the two triangles formed by the
above construction are similar to each other, and therefore the corresponding sides
are in proportion (Theorem 11.3.9). It follows that $\frac{x}{a} = \frac{1}{b}$. Thus, $x = \frac{a}{b}$, so we have
constructed $\frac{a}{b}$.
Fig. 12.9 Constructing quotients (case b < 1)


The case where b is less than 1 is very similar. In this case, 1 is to the right of
b on the number line. Use the straightedge to make a side of an angle starting at 0
above the number line. Since a is constructible, we can open the compass to radius
a and mark a point on the new line that is distance a from the vertex of the angle,
as in Figure 12.9. Then use the straightedge to draw a straight line between that
point and the point b on the number line. Copy the angle, β, at the point b on the
number line to the point 1 on the number line and extend the side of the angle so
that it intersects the other line. The compass can then be opened to radius equal
to the distance from that point of intersection to the origin. If x denotes that radius,
then the fact that the corresponding sides of similar triangles are proportional gives
$\frac{x}{a} = \frac{1}{b}$, so that $x = \frac{a}{b}$. Thus, $\frac{a}{b}$ is constructible. □
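
The quotient construction can be traced in coordinates. The following sketch is an illustration only: it assumes that the new line through 0 runs along the unit vector (3/5, 4/5), a choice made here so that all coordinates stay rational, and the function name `construct_quotient` is hypothetical. Copying the angle β to the point 1 amounts to drawing the line through 1 parallel to the line joining b to the point at distance a, and the code solves for where that parallel meets the new line.

```python
from fractions import Fraction

def construct_quotient(a, b):
    """Distance x from 0 to the constructed point, for positive rational a and b."""
    a, b = Fraction(a), Fraction(b)
    ux, uy = Fraction(3, 5), Fraction(4, 5)  # unit vector along the new line (assumed)
    px, py = a * ux, a * uy                  # point at distance a from 0 on the new line
    dx, dy = px - b, py                      # direction of the line from b to that point
    # The parallel through 1 consists of the points (1 + t*dx, t*dy).
    # It meets the new line where (1 + t*dx, t*dy) = (x*ux, x*uy).
    # Eliminating t = x*uy/dy gives x*(ux - uy*dx/dy) = 1.
    return 1 / (ux - uy * dx / dy)

print(construct_quotient(3, 7))                            # case b > 1
print(construct_quotient(Fraction(5, 2), Fraction(1, 3)))  # case b < 1
```

Both cases of the proof are covered by the same computation, and the results are 3/7 and 15/2, matching $x = \frac{a}{b}$.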


The above easily leads to the result that products and quotients of constructible 
numbers are constructible. 


Corollary 12.2.7. If a and b are constructible numbers, then ab is constructible
and, if b ≠ 0, $\frac{a}{b}$ is constructible.


Proof. First, suppose that a and b are both positive. Then $\frac{a}{b}$ is constructible by
the previous theorem (12.2.6). Let $c = \frac{1}{b}$; then c is constructible by the previous
theorem using a = 1. Since c is constructible, the previous theorem implies that $\frac{a}{c}$
is constructible. But $\frac{a}{c} = \frac{a}{1/b} = ab$, so ab is constructible.

If one or both of a and b is negative, the above can be applied to |a| and |b|. Then
ab = |a| · |b| if a and b are both negative, and ab = −|a| · |b| if exactly one of them
is negative. Similarly, $\frac{a}{b}$ is equal to one of $\frac{|a|}{|b|}$ or $-\frac{|a|}{|b|}$. Since we can construct the
negative of any constructible number (Theorem 12.2.4), it follows that ab and $\frac{a}{b}$ are
constructible in this case as well. □


A “field” is an abstract mathematical concept. In this book we do not need to 
consider general fields; we only need to consider subfields of R. The following 
definition forms the basis for the rest of this chapter. 


Definition 12.2.8. A subfield of R is a set F of real numbers satisfying the
following properties:


(i) The numbers 0 and 1 are both in F.
(ii) If x and y are in F, then x + y and xy are in F (i.e., F is “closed under
addition” and “closed under multiplication”).
(iii) If x is in F, then −x is in F.
(iv) If x is in F and x ≠ 0, then $\frac{1}{x}$ is in F.


In this chapter, we use the word field to mean “subfield of R.” There are many 
different subfields of R. Of course, R itself is a subfield of R. So is the set Q of
rational numbers. It is clear that R is the biggest subfield of R; it is almost as obvious 
that Q is the smallest, in the following sense. 


Theorem 12.2.9. If F is any subfield of R, then F contains all rational numbers.


Proof. To see this, first note that 0 and 1 are in F by property (i) of a subfield of
R. Then property (ii) implies that 2 is in F, and 3 is in F, and so on. That is, F
contains all the natural numbers (this can be formally established by a very easy
mathematical induction). Property (iii) then implies that F contains all integers.
By property (iv), F contains the reciprocals of every integer other than 0, so, by
property (ii), F contains all rational numbers. □


The following is an important fact. 
Theorem 12.2.10. The set of constructible numbers is a subfield of R. 


Proof. This follows immediately from Theorems 12.2.4 and 12.2.5 and 
Corollary 12.2.7. oO 


One of the fundamental theorems in this chapter (Theorem 12.3.12) will provide 
an alternative characterization of the field of constructible numbers. 


Example 12.2.11. The set $Q(\sqrt{2})$ defined by

$$Q(\sqrt{2}) = \{a + b\sqrt{2} : a, b \in Q\}$$

is a subfield of R.


Proof. It is clear that $Q(\sqrt{2})$ contains 0 (since it equals $0 + 0 \cdot \sqrt{2}$) and 1 (since it
equals $1 + 0 \cdot \sqrt{2}$). Moreover,

$$(a_1 + b_1\sqrt{2}) + (a_2 + b_2\sqrt{2}) = (a_1 + a_2) + (b_1 + b_2)\sqrt{2}.$$

Hence, $Q(\sqrt{2})$ is closed under addition. Furthermore,

$$(a_1 + b_1\sqrt{2})(a_2 + b_2\sqrt{2}) = (a_1a_2 + 2b_1b_2) + (a_1b_2 + a_2b_1)\sqrt{2},$$

so $Q(\sqrt{2})$ is closed under multiplication. Also, $-(a + b\sqrt{2}) = (-a) + (-b)\sqrt{2}$.

It remains to be shown that $\frac{1}{a + b\sqrt{2}}$ is in $Q(\sqrt{2})$ whenever a and b are not both 0.
But,

$$\frac{1}{a + b\sqrt{2}} = \frac{a - b\sqrt{2}}{(a + b\sqrt{2})(a - b\sqrt{2})} = \frac{a - b\sqrt{2}}{a^2 - 2b^2} = \frac{a}{a^2 - 2b^2} - \frac{b}{a^2 - 2b^2}\sqrt{2},$$

which is the sum of a rational number and a number that is the product of a rational
number and $\sqrt{2}$ and is therefore in $Q(\sqrt{2})$. (Of course, the above expression would
not make sense if $a^2 - 2b^2 = 0$. However, this cannot be the case, since $a^2 - 2b^2 = 0$
would imply $\left(\frac{a}{b}\right)^2 = 2$, and we know that $\sqrt{2}$ is irrational (Theorem 8.2.6).) □
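
The formulas in this proof translate directly into exact arithmetic on pairs of rational numbers. The following Python sketch is an illustration, not part of the proof; the class name `QSqrt2` and its method names are hypothetical. An element $a + b\sqrt{2}$ is stored as the pair (a, b), and each operation returns another such pair, which is what closure means.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class QSqrt2:
    """The number a + b*sqrt(2), with rational a and b."""
    a: Fraction
    b: Fraction

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a1 + b1*sqrt(2)) (a2 + b2*sqrt(2)) = (a1*a2 + 2*b1*b2) + (a1*b2 + a2*b1)*sqrt(2)
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + other.a * self.b)

    def __neg__(self):
        return QSqrt2(-self.a, -self.b)

    def inverse(self):
        # 1/(a + b*sqrt(2)) = a/(a^2 - 2b^2) - (b/(a^2 - 2b^2))*sqrt(2)
        norm = self.a ** 2 - 2 * self.b ** 2   # nonzero unless a = b = 0
        if norm == 0:
            raise ZeroDivisionError("0 has no reciprocal")
        return QSqrt2(self.a / norm, -self.b / norm)

x = QSqrt2(Fraction(3), Fraction(-1, 2))       # 3 - (1/2)*sqrt(2)
one = QSqrt2(Fraction(1), Fraction(0))
zero = QSqrt2(Fraction(0), Fraction(0))

print(x.inverse())              # the reciprocal is again of the form a + b*sqrt(2)
print(x * x.inverse() == one)
```

Here the reciprocal of $3 - \frac{1}{2}\sqrt{2}$ comes out as $\frac{6}{17} + \frac{1}{17}\sqrt{2}$, with rational coefficients as the proof predicts.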


The field $Q(\sqrt{2})$ is the field obtained by starting with the field Q and “adjoining
$\sqrt{2}$” to Q; it is called “the extension of Q by $\sqrt{2}$.” This is a special case of a much
more general situation.


Theorem 12.2.12. Let F be any subfield of R and let r be any positive number in
F. If $\sqrt{r}$ is not in F, then

$$F(\sqrt{r}) = \{a + b\sqrt{r} : a, b \in F\}$$

is a subfield of R.


Proof. The proof is very similar to the proof given above for the special case of
$Q(\sqrt{2})$ (Example 12.2.11). It is easily seen that 0 and 1 are in $F(\sqrt{r})$, that $F(\sqrt{r})$
is closed under addition, and that the negative of every element of $F(\sqrt{r})$ is also in
$F(\sqrt{r})$. To see that it is closed under multiplication, note that

$$(a_1 + b_1\sqrt{r})(a_2 + b_2\sqrt{r}) = (a_1a_2 + rb_1b_2) + (a_1b_2 + a_2b_1)\sqrt{r}.$$

This is in $F(\sqrt{r})$ since r is in F and F itself is a field.
Also, for any a and b in F,

$$\frac{1}{a + b\sqrt{r}} = \frac{a - b\sqrt{r}}{(a + b\sqrt{r})(a - b\sqrt{r})} = \frac{a - b\sqrt{r}}{a^2 - rb^2} = \frac{a}{a^2 - rb^2} - \frac{b}{a^2 - rb^2}\sqrt{r}.$$

Note that $a^2 - rb^2 \neq 0$ unless a and b are both 0, because $\sqrt{r} \notin F$. (If $a^2 - rb^2 = 0$
and $b \neq 0$, then $\left(\frac{a}{b}\right)^2 = r$, and it would follow that $\sqrt{r} \in F$.) Hence, $\frac{1}{a + b\sqrt{r}} \in F(\sqrt{r})$
whenever $a + b\sqrt{r}$ is not 0. □

Definition 12.2.13. If F is a subfield of R and r is a positive number that is in F
such that $\sqrt{r}$ is not in F, then the field

$$F(\sqrt{r}) = \{a + b\sqrt{r} : a, b \in F\}$$

is the field obtained by adjoining $\sqrt{r}$ to F and is called the extension of F by $\sqrt{r}$.


Example 12.2.14. Since $\sqrt{5}$ is not an element of $Q(\sqrt{2})$, the extension of $Q(\sqrt{2})$
by $\sqrt{5}$ is

$$\{a + b\sqrt{5} : a, b \in Q(\sqrt{2})\} = \{(c + d\sqrt{2}) + (e + f\sqrt{2})\sqrt{5} : c, d, e, f \in Q\}.$$
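
The extension of an extension in Example 12.2.14 can be modeled by letting the coefficients a and b themselves be elements of a smaller field. The sketch below is a minimal illustration under stated assumptions: the class `Ext` and the helpers `q2` and `q25` are hypothetical names, and the code does not check that $\sqrt{r}$ is missing from the base field (here that fact is taken from the example). The arithmetic follows the formulas in the proof of Theorem 12.2.12.

```python
from fractions import Fraction

class Ext:
    """An element a + b*sqrt(r) of F(sqrt(r)); a, b, and r are elements of F."""

    def __init__(self, a, b, r):
        self.a, self.b, self.r = a, b, r

    def __add__(self, other):
        return Ext(self.a + other.a, self.b + other.b, self.r)

    def __neg__(self):
        return Ext(-self.a, -self.b, self.r)

    def __sub__(self, other):
        return self + (-other)

    def __mul__(self, other):
        # (a1 + b1*sqrt(r)) (a2 + b2*sqrt(r)) = (a1*a2 + r*b1*b2) + (a1*b2 + a2*b1)*sqrt(r)
        return Ext(self.a * other.a + self.r * self.b * other.b,
                   self.a * other.b + other.a * self.b, self.r)

    def inverse(self):
        # 1/(a + b*sqrt(r)) = (a - b*sqrt(r)) / (a^2 - r*b^2), computed in the base field
        norm = self.a * self.a - self.r * self.b * self.b
        return Ext(self.a / norm, -self.b / norm, self.r)

    def __truediv__(self, other):
        return self * other.inverse()

    def __eq__(self, other):
        return self.a == other.a and self.b == other.b

def q2(a, b):
    """a + b*sqrt(2) in Q(sqrt(2)), with rational a and b."""
    return Ext(Fraction(a), Fraction(b), Fraction(2))

FIVE = q2(5, 0)   # the number 5, viewed as an element of Q(sqrt(2))

def q25(c, d, e, f):
    """(c + d*sqrt(2)) + (e + f*sqrt(2))*sqrt(5)."""
    return Ext(q2(c, d), q2(e, f), FIVE)

x = q25(0, 1, 1, 0)      # sqrt(2) + sqrt(5)
one = q25(1, 0, 0, 0)
x_inv = x.inverse()

print(x * x_inv == one)
print(x_inv == q25(0, Fraction(-1, 3), Fraction(1, 3), 0))
```

The second comparison reflects the identity $\frac{1}{\sqrt{2} + \sqrt{5}} = \frac{\sqrt{5} - \sqrt{2}}{3}$. Division in the outer field is carried out by division in the inner field, which in turn uses division of rational numbers.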


For present purposes, we are interested in adjoining square roots to fields of real 
numbers because that can be done in a “constructible” way. 
Theorem 12.2.15. If r is a positive constructible number, then $\sqrt{r}$ is constructible.


Proof. Mark the number r + 1 on the number line; label it A as in Figure 12.10.
Let $M = \frac{r+1}{2}$. M is constructible. Make a circle with center M and radius M. The
circle then goes through the point A and also the point corresponding to 0, which
we label O. Use D to denote the point corresponding to r on the number line. Erect
a perpendicular to the number line at D (Theorem 12.1.7), and let C be the point
above the number line at which that perpendicular intersects the circle.


Fig. 12.10 Constructing square roots 


The angle OCA is 90°, since it is inscribed in a semicircle (Theorem 11.3.11).
Therefore, the sum of the angles OCD and DCA is 90°. Since the angles COD and
OCD also sum to 90° (triangle OCD has a right angle at D), it follows that
the angle COD equals the angle DCA. Thus, triangle OCD is similar to triangle
DCA, so their corresponding sides are proportional (Theorem 11.3.9). Let x denote
the length of the perpendicular from C to D. Then $\frac{x}{1} = \frac{r}{x}$, so $x^2 = r$. Hence,
$x = \sqrt{r}$ and $\sqrt{r}$ is constructible. □
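
As a numerical illustration of this construction, the height of C above D can be computed directly from the equation of the circle. The sketch below assumes the coordinates used in the proof (O at 0, D at r, A at r + 1, center M at (r + 1)/2); the function name `construct_sqrt` is hypothetical. Floating-point arithmetic is used, so the comparisons hold only up to rounding.

```python
import math

def construct_sqrt(r):
    """Height x of the point C = (r, x) on the circle with center M and radius M."""
    m = (r + 1) / 2                          # the center M, which is also the radius
    # C lies on the circle (X - m)^2 + Y^2 = m^2 with X = r and Y = x > 0.
    return math.sqrt(m * m - (r - m) ** 2)

for r in (0.25, 2, 9, 10.5):
    x = construct_sqrt(r)
    print(r, x, math.isclose(x * x, r))      # x^2 = r, as the similar triangles give
```

Algebraically, $M^2 - (r - M)^2 = r(2M - r) = r$, which is the same relation $x^2 = r$ obtained in the proof.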

It follows immediately from this theorem (12.2.15) and the fact that the constructible
numbers form a field (Theorem 12.2.10) that every number in $Q(\sqrt{2})$ is
constructible. More generally, every element of $Q(\sqrt{r})$ is constructible, for every
positive rational number r such that $\sqrt{r}$ is irrational. Even more generally, if F is a
field consisting of constructible numbers and r is a positive number in F such that
$\sqrt{r}$ is not in F, then $F(\sqrt{r})$ consists of constructible numbers. That is, if we start
with Q and keep on adjoining square roots, we get constructible numbers.


Definition 12.2.16. A tower of fields is a finite sequence $F_0, F_1, F_2, \ldots, F_n$ of
subfields of R such that $F_0 = Q$ and, for each i from 1 to n, there is a positive
number $r_i$ in $F_{i-1}$ such that $\sqrt{r_i}$ is not in $F_{i-1}$ and $F_i = F_{i-1}(\sqrt{r_i})$.


Note that a tower can be described as a finite sequence $F_0, F_1, F_2, \ldots, F_n$ of
fields of real numbers such that

$$F_0 \subset F_1 \subset F_2 \subset \cdots \subset F_n$$

with $F_0 = Q$ and each $F_i$ obtained from its predecessor $F_{i-1}$ by adjoining a square
root.
12.3 Surds


We will show that the constructible numbers are exactly those real numbers that are 
in fields that are in towers. There is a name that is sometimes used for such numbers. 


Definition 12.3.1. A surd is a number that is in some field that is in a tower. That
is, x is a surd if there exists a tower

$$F_0 \subset F_1 \subset F_2 \subset \cdots \subset F_n$$

such that x is in $F_n$.


(It should be noted that the word “surd” is sometimes given different meanings. 
We use the definition given above because we find it most useful.) 


Theorem 12.3.2. The set of all surds is a subfield of R. Moreover, if r is a positive
surd, then $\sqrt{r}$ is a surd.


Proof. To show that the set of surds is a field, it must be shown that the arithmetic
operations applied to surds produce surds. This follows immediately if it is shown
that, for any surds x and y, there exists a field F containing both x and y that
occurs in some tower. If $\{\sqrt{r_1}, \sqrt{r_2}, \ldots, \sqrt{r_m}\}$ are the numbers adjoined in making
a tower that contains x and $\{\sqrt{s_1}, \sqrt{s_2}, \ldots, \sqrt{s_n}\}$ are the numbers adjoined in
making a tower containing y, then adjoining all of those numbers, one after another
(and skipping any square root that already lies in the field reached at that stage),
produces a field that contains both x and y. Thus, the set of surds is a subfield of R.

To show that square roots of positive surds are surds, let r be a positive surd.
Then r is in some field F that is in a tower. If $\sqrt{r}$ is in F, then $\sqrt{r}$ is clearly a surd.
If $\sqrt{r}$ is not in F, then $\sqrt{r}$ is in $F(\sqrt{r})$, which is in a tower that has one more field
than the tower leading to F. □


Theorem 12.3.3. Every surd is constructible. 


Proof. This follows immediately from the theorems that the rational numbers 
are constructible (Theorem 12.2.3), that the constructible numbers form a field 
(Theorem 12.2.10), and that the square root of a positive constructible number is 
constructible (Theorem 12.2.15). □


The fundamental theorem that we will need is that the constructible numbers are 
exactly the surds. Given Theorem 12.3.3, this will follow if it is established that 
starting with the numbers 0 and 1 and performing constructions with straightedge
and compass never produces any numbers that are not surds. Since constructions 
take place in the plane, we will have to investigate what points in the plane can be 
constructed. 


Definition 12.3.4. We say that the point (x, y) in the plane is constructible if 
that point can be obtained from the points (0, 0) and (1, 0) by performing a finite 
sequence of constructions with straightedge and compass. 
Theorem 12.3.5. The point (x, y) is constructible if and only if both of the
coordinates x and y are constructible numbers. 


Proof. If x and y are constructible numbers, then the point (x, y) can be constructed 
by constructing the point x on the x-axis, erecting a perpendicular to the x-axis at 
the point x (Theorem 12.1.7) and constructing y on that perpendicular. 

Conversely, if the point (x, y) has been constructed, then the number x can be 
constructed by dropping a perpendicular from (x, y) to the x-axis (Theorem 12.1.8) 
and the number y can be constructed by dropping a perpendicular to the y-axis. □


Definition 12.3.6. The surd plane is the set of all points (x, y) in the xy-plane such 
that the coordinates, x and y, are both surds. 


By what we have shown above, every point in the surd plane is constructible 
(Theorems 12.3.3 and 12.3.5). We need to show that every constructible point is in 
the surd plane. 

After we have constructed some points, how can we construct others? We can 
use a straightedge to make lines joining any two points we have constructed, and 
we can use a compass to construct a circle centered at a constructible point with a 
radius that is constructible. New constructible points can then be obtained as points 
of intersection of lines or circles that we have constructed. 

Any one line in the plane has many different equations, as does any one circle. 
We need to know that there are equations with surd coefficients for all of the lines 
and circles that arise in constructions. 


Theorem 12.3.7. If a line goes through two points in the surd plane, then there is
an equation for that line that has surd coefficients. 


Proof. Suppose that $(x_1, y_1)$ and $(x_2, y_2)$ are distinct points in the surd plane. We
consider two cases. If $x_1 \neq x_2$, then

$$y - y_1 = \frac{y_2 - y_1}{x_2 - x_1}(x - x_1)$$

is an equation of the line through the points $(x_1, y_1)$ and $(x_2, y_2)$. Since the surds
form a field, the coefficients in this equation are all surds. In the other case, where
$x_1 = x_2$, an equation of the line is $x = x_1$.

In both cases, we have shown that an equation for the line through the points
$(x_1, y_1)$ and $(x_2, y_2)$ can be expressed in the form ax + by = c, where a, b, and c
are all surds and a and b are not both 0. □


Theorem 12.3.8. A circle whose center is in the surd plane and whose radius is a 
surd has an equation in which the coefficients are all surds. 


Proof. Let the center be $(x_1, y_1)$ and the radius be r. Then one equation of the circle
is $(x - x_1)^2 + (y - y_1)^2 = r^2$. Expanding this equation and using the fact that the
set of surds is a field shows that this equation has surd coefficients. □
Theorem 12.3.9. The point of intersection of two distinct nonparallel lines that
have equations with surd coefficients is a point in the surd plane. 


Proof. Let such equations be $a_1 x + b_1 y = c_1$ and $a_2 x + b_2 y = c_2$. We consider two
cases, that where $a_1 = 0$ and that where $a_1 \neq 0$.

If $a_1 = 0$, then $b_1 \neq 0$ and $a_2 \neq 0$ (or else the two lines would be parallel). Then
$y = \frac{c_1}{b_1}$, so $a_2 x + b_2 \frac{c_1}{b_1} = c_2$, from which it follows that the intersection of the
two lines has coordinates $x = \frac{c_2 b_1 - b_2 c_1}{a_2 b_1}$ and $y = \frac{c_1}{b_1}$, both of which are surds.

If $a_1 \neq 0$, then $x = -\frac{b_1}{a_1} y + \frac{c_1}{a_1}$. Substituting this in the second equation yields
$a_2\left(-\frac{b_1}{a_1} y + \frac{c_1}{a_1}\right) + b_2 y = c_2$. Since the coefficients are all surds, it is clear that y is
also a surd. Hence, so is x, and the theorem is proven in this case as well. □
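
The point of the proof is that the intersection is computed from the coefficients using only field operations. The following Python sketch illustrates this under two assumptions that are not from the text: the coefficients are rational (rationals are surds, and `fractions.Fraction` keeps the arithmetic exact), and Cramer's rule is used in place of the proof's two-case substitution. The name `intersect_lines` is illustrative.

```python
from fractions import Fraction

def intersect_lines(a1, b1, c1, a2, b2, c2):
    """Intersection of a1*x + b1*y = c1 and a2*x + b2*y = c2.

    Only +, -, * and / are applied to the coefficients (Cramer's rule),
    so the result lies in whatever field the coefficients come from.
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or identical")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# Example: the lines x + 2y = 3 and 3x - y = 2 meet at the point (1, 1).
x, y = intersect_lines(*map(Fraction, (1, 2, 3, 3, -1, 2)))
print(x, y)
```

No square roots are needed here, which is why intersecting two lines never leaves the field that contains the coefficients.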


We next consider the points of intersection of a line and a circle. 


Theorem 12.3.10. The points of intersection of a line that has an equation with 
surd coefficients and a circle that has an equation with surd coefficients lie in the 
surd plane. 


Proof. Let $ax + by = c$ and $(x - f)^2 + (y - g)^2 = r^2$ be the equations of a line
and a circle, respectively, in which all of the coefficients are surds. Consider first
the case where $a = 0$. In this case, $y = \frac{c}{b}$. Substituting this in the equation of the
circle yields $(x - f)^2 + \left(\frac{c}{b} - g\right)^2 = r^2$. This is a quadratic equation in x. It has 0,
1, or 2 real number solutions depending upon whether the line does not intersect the 
circle, is tangent to the circle, or intersects the circle in two points. The quadratic 
formula (Problem 6 in Chapter 9) shows that solutions that exist are obtained from 
the coefficients by the ordinary arithmetic operations and the extracting of a square 
root. All of these operations on surds produce surds. Thus any solutions x are surds, 
proving the theorem in this case. 

If $a \neq 0$, then $x = -\frac{b}{a} y + \frac{c}{a}$. Substituting this value in the equation of the circle
yields $\left(-\frac{b}{a} y + \frac{c}{a} - f\right)^2 + (y - g)^2 = r^2$. As above, any solutions of this equation
are also surds. Therefore the theorem holds in this case too. □
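
As a numerical illustration, the sketch below computes the intersection points of a line and a circle. It is an assumption-laden sketch, not the proof's method: it works in floating point and organizes the computation around the foot of the perpendicular from the center to the line instead of substituting into the quadratic. The ingredients are the same, though: field operations and a single square root. The function name is illustrative.

```python
import math

def line_circle_intersections(a, b, c, f, g, r):
    """Points where a*x + b*y = c meets (x - f)^2 + (y - g)^2 = r^2."""
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be 0")
    n2 = a * a + b * b
    # Foot of the perpendicular from the center (f, g) to the line.
    k = (c - a * f - b * g) / n2
    px, py = f + a * k, g + b * k
    # r^2 minus the squared distance from the center to the line.
    h2 = r * r - k * k * n2
    if h2 < 0:
        return []                      # the line misses the circle
    h = math.sqrt(h2 / n2)             # the only square root taken
    if h == 0:
        return [(px, py)]              # the line is tangent
    return [(px - b * h, py + a * h), (px + b * h, py - a * h)]

# The x-axis meets the unit circle at (-1, 0) and (1, 0).
print(line_circle_intersections(0, 1, 0, 0, 0, 1))
```

The three return branches correspond to the 0, 1, or 2 real solutions of the quadratic mentioned in the proof.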


The remaining case is the intersection of two circles. 


Theorem 12.3.11. The points of intersection of two distinct circles that have 
equations with surd coefficients lie in the surd plane. 


Proof. In order for two distinct circles to intersect, they must have distinct centers.
Thus, the equations of the circles can be written in the form $(x - a_1)^2 + (y - b_1)^2 = r_1^2$
and $(x - a_2)^2 + (y - b_2)^2 = r_2^2$, which is equivalent to

$$x^2 - 2a_1 x + a_1^2 + y^2 - 2b_1 y + b_1^2 = r_1^2$$

$$x^2 - 2a_2 x + a_2^2 + y^2 - 2b_2 y + b_2^2 = r_2^2$$

where $(a_1, b_1)$ and $(a_2, b_2)$ are distinct points. (This means that $a_1 \neq a_2$ or $b_1 \neq b_2$.)
Subtracting the second equation from the first shows that any point (x, y) that lies
on both circles also lies on the line with equation

$$(-2a_1 + 2a_2)x + a_1^2 - a_2^2 + (-2b_1 + 2b_2)y + b_1^2 - b_2^2 = r_1^2 - r_2^2$$

Since this equation has surd coefficients, all points of intersection of this line with
either circle lie in the surd plane (Theorem 12.3.10). □
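
The subtraction step can be sketched in a few lines of Python. The function below is a hypothetical helper (its name and the `(A, B, C)` return convention are assumptions of this sketch); it returns the coefficients of the line $Ax + By = C$ obtained by moving the constant terms of the subtracted equation to the right-hand side.

```python
def radical_line(a1, b1, r1, a2, b2, r2):
    """Line A*x + B*y = C through the common points of two circles.

    Circle i has center (ai, bi) and radius ri. The coefficients come
    from subtracting the second expanded circle equation from the first.
    """
    if (a1, b1) == (a2, b2):
        raise ValueError("the centers must be distinct")
    A = -2 * a1 + 2 * a2
    B = -2 * b1 + 2 * b2
    C = (r1**2 - r2**2) - (a1**2 - a2**2) - (b1**2 - b2**2)
    return A, B, C

# Unit circles centered at (0, 0) and (1, 0): the line is 2x = 1.
print(radical_line(0, 0, 1, 1, 0, 1))
```

Only additions, subtractions, and multiplications of the given numbers appear, so surd inputs give surd coefficients; the square root enters only afterwards, when this line is intersected with one of the circles.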


Theorem 12.3.12. The field of constructible numbers is the same as the field of 
surds. 


Proof. We already showed that every surd is constructible (Theorem 12.3.3). On 
the other hand, Theorems 12.3.7, 12.3.8, 12.3.9, 12.3.10, and 12.3.11 show that 
starting with surd points in the plane and doing geometric constructions produces 
only surd points. Thus, every constructible point in the plane has surd coordinates. 
Since every constructible number is a coordinate of a constructible point in the plane 
(Theorem 12.3.5), it follows that every constructible number is a surd. □


This characterization of the constructible numbers is the key to the proof that 
certain angles cannot be trisected. One of the relationships between constructible 
angles and constructible numbers can be obtained using the trigonometric function 
cosine, as follows. We restrict the discussion to acute angles (i.e., angles less than a
right angle) simply to avoid having to describe several cases. 


Theorem 12.3.13. The acute angle $\theta$ is constructible with a straightedge and
compass if and only if $\cos\theta$ is a constructible number.


Proof. Suppose first that the angle $\theta$ is constructible. Copy the angle so that its
vertex lies at the point 0 on the number line, one of its sides is the positive part
of the number line and the other side is above it, as in Figure 12.11. Use the
compass to mark a point on the upper side of the angle that is one unit from the
point 0. Drop a perpendicular from that point to the number line (Theorem 12.1.8).
Then that perpendicular meets the number line at $\cos\theta$, so $\cos\theta$ is constructed.


Fig. 12.11 Constructing the cosine of an angle


Conversely, if $\cos\theta$ is constructed, erect a perpendicular upwards from the point
$\cos\theta$ on the number line (Theorem 12.1.7). Construct the number $a = \sqrt{1 - \cos^2\theta}$
and mark the point on the perpendicular with that distance above the number line,
as in Figure 12.12. Connecting the point 0 to that marked point by a straightedge
produces the angle $\theta$. □


Fig. 12.12 Constructing an angle from its cosine


With this background and some other preliminary results, we will be able to 
determine exactly which angles with an integral number of degrees are constructible 
(see Theorem 12.4.13). First note the following. 


Theorem 12.3.14. An angle of 60° is constructible. 


Proof. This is an immediate consequence of Theorem 12.3.13, for the cosine of 60°
equals $\frac{1}{2}$ and $\frac{1}{2}$ is a constructible number.

There is also an easy direct proof: simply construct an equilateral triangle using a 
straightedge and compass; each angle of the equilateral triangle is 60°. To construct 
an equilateral triangle, draw a circle of any radius, call it r, centered at a point A. 
Draw a line through A that intersects the circle and label a point of intersection B, 
as in Figure 12.13. Next draw a circle of radius r centered at B and label a point of 
intersection of the two circles with C. Now draw the segments AC and BC. Triangle 
ABC is an equilateral triangle (all of whose sides have length r). □


Fig. 12.13 Constructing an equilateral triangle 


Corollary 12.3.15. The following angles are all constructible: 30°, 15°, 45°, 
and 75°. 


Proof. We begin with the fact that an angle of 60° is constructible (Theorem 12.3.14). 
An angle of 30° can be constructed by bisecting an angle of 60°, and an angle of 15° 
can be constructed by bisecting an angle of 30° (Theorem 12.1.4). An angle of 45° 
can be constructed by adding an angle of 15° to one of 30°, and an angle of 75° can 
be constructed by adding an angle of 15° to an angle of 60° (Theorem 12.1.9). □


The material about constructible numbers was developed primarily to prove that
some angles are not constructible. We need some additional preliminary results. 


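Theorem 12.3.16. For any angle $\theta$, $\cos(3\theta) = 4\cos^3\theta - 3\cos\theta$.


Proof. Recall the addition formulae for cosine and sine:

$$\cos(\theta_1 + \theta_2) = \cos\theta_1 \cos\theta_2 - \sin\theta_1 \sin\theta_2$$

and

$$\sin(\theta_1 + \theta_2) = \sin\theta_1 \cos\theta_2 + \sin\theta_2 \cos\theta_1$$

In particular, if $\theta = \theta_1 = \theta_2$, then

$$\cos(2\theta) = \cos^2\theta - \sin^2\theta$$

and

$$\sin(2\theta) = 2\sin\theta\cos\theta$$

Therefore,

$$\begin{aligned}
\cos(3\theta) &= \cos(2\theta + \theta) \\
&= \cos(2\theta)\cos\theta - \sin(2\theta)\sin\theta \\
&= (\cos^2\theta - \sin^2\theta)\cos\theta - 2\sin\theta\cos\theta\sin\theta \\
&= \cos^3\theta - \sin^2\theta\cos\theta - 2\sin^2\theta\cos\theta \\
&= \cos^3\theta - 3\sin^2\theta\cos\theta
\end{aligned}$$

The trigonometric identity $\sin^2\theta + \cos^2\theta = 1$ implies that $\sin^2\theta = 1 - \cos^2\theta$,
which gives

$$\begin{aligned}
\cos(3\theta) &= \cos^3\theta - 3(1 - \cos^2\theta)\cos\theta \\
&= \cos^3\theta - 3\cos\theta + 3\cos^3\theta \\
&= 4\cos^3\theta - 3\cos\theta
\end{aligned}$$

Therefore, $\cos(3\theta) = 4\cos^3\theta - 3\cos\theta$. □

The case where $\theta$ is an angle of 20° is of particular interest.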

Corollary 12.3.17. If $x = 2\cos(20^\circ)$, then $x^3 - 3x - 1 = 0$.

Proof. Using the formula for $\cos(3\theta)$ given above and the fact that the cosine of 60°
equals $\frac{1}{2}$, we have $\frac{1}{2} = 4\cos^3(20^\circ) - 3\cos(20^\circ)$. This is equivalent to the equation

$$8\cos^3(20^\circ) - 6\cos(20^\circ) - 1 = 0$$

Since $x = 2\cos(20^\circ)$, $x^3 - 3x - 1 = 0$. □


We will show that the cubic equation $x^3 - 3x - 1 = 0$ does not have a constructible
root. We need some preliminary results.
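
Before turning to those results, Corollary 12.3.17 can be checked numerically. The short Python sketch below evaluates the cubic at $2\cos(20^\circ)$ in floating point; the function name `p` is illustrative, and a value near zero is only a sanity check, not a proof.

```python
import math

def p(x):
    """The cubic x^3 - 3x - 1 from Corollary 12.3.17."""
    return x**3 - 3 * x - 1

x = 2 * math.cos(math.radians(20))
print(x, p(x))   # p(x) is zero up to floating-point rounding
```

The same check succeeds for $2\cos(140^\circ)$ and $2\cos(260^\circ)$, since tripling either angle also gives an angle whose cosine is $\frac{1}{2}$; these are the other two roots of the cubic.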


Theorem 12.3.18. If the roots of the cubic equation $x^3 + bx^2 + cx + d = 0$ are
$r_1$, $r_2$, and $r_3$, then $b = -(r_1 + r_2 + r_3)$. (It is possible that two or even three of the
roots are the same as each other.)


Proof. By the Factor Theorem (9.3.6), and the fact that the coefficient of $x^3$ is 1,
the cubic equation is the same as $(x - r_1)(x - r_2)(x - r_3) = 0$. Multiplying this out
shows that the coefficient of $x^2$ is $-(r_1 + r_2 + r_3)$; i.e., $b = -(r_1 + r_2 + r_3)$. □
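
The multiplication in this proof can be carried out mechanically. The following Python sketch expands $(x - r_1)(x - r_2)(x - r_3)$ for sample rational roots; the helper names and the convention of listing coefficients from the constant term upward are assumptions of this sketch.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def monic_from_roots(roots):
    """Coefficients of the product of the factors (x - r), lowest degree first."""
    poly = [1]
    for r in roots:
        poly = poly_mul(poly, [-r, 1])   # multiply by the factor (x - r)
    return poly

roots = [Fraction(1, 2), Fraction(-3), Fraction(5)]
d, c, b, lead = monic_from_roots(roots)
print(b, -sum(roots))   # the two printed values agree
```

For these roots the expansion is $x^3 - \frac{5}{2}x^2 - 14x + \frac{15}{2}$, and $-\frac{5}{2}$ is indeed the negative of $\frac{1}{2} - 3 + 5$.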


We need the concept of a conjugate for elements of $\mathcal{F}(\sqrt{r})$, analogous to the
conjugate of a complex number.


Definition 12.3.19. If $a + b\sqrt{r}$ is an element of $\mathcal{F}(\sqrt{r})$, then the conjugate of
$a + b\sqrt{r}$, denoted by placing a bar on top of the number, is

$$\overline{a + b\sqrt{r}} = a - b\sqrt{r}$$


Theorem 12.3.20. The conjugate of the sum of two elements of $\mathcal{F}(\sqrt{r})$ is the sum
of the conjugates, and the conjugate of the product of two elements of $\mathcal{F}(\sqrt{r})$ is the
product of the conjugates.


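Proof. For the first assertion, simply note that

$$\begin{aligned}
\overline{(a + b\sqrt{r}) + (c + d\sqrt{r})} &= \overline{(a + c) + (b + d)\sqrt{r}} \\
&= (a + c) - (b + d)\sqrt{r} \\
&= (a - b\sqrt{r}) + (c - d\sqrt{r}) \\
&= \overline{(a + b\sqrt{r})} + \overline{(c + d\sqrt{r})}
\end{aligned}$$

For products, note that

$$\begin{aligned}
\overline{(a + b\sqrt{r})(c + d\sqrt{r})} &= \overline{(ac + rbd) + (ad + bc)\sqrt{r}} \\
&= (ac + rbd) - (ad + bc)\sqrt{r}
\end{aligned}$$

and

$$\begin{aligned}
\overline{(a + b\sqrt{r})} \cdot \overline{(c + d\sqrt{r})} &= (a - b\sqrt{r})(c - d\sqrt{r}) \\
&= (ac + bdr) - (ad + bc)\sqrt{r}
\end{aligned}$$

Therefore, $\overline{(a + b\sqrt{r})(c + d\sqrt{r})} = \overline{(a + b\sqrt{r})} \cdot \overline{(c + d\sqrt{r})}$. □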
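
The two identities can be tested on concrete elements. The sketch below is a minimal model in Python under a simplifying assumption that is not in the text: the base field is taken to be the rationals, so an element $a + b\sqrt{r}$ is stored as a pair of `Fraction` values together with `r`. The class name `QuadElem` and its methods are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass(frozen=True)
class QuadElem:
    """The number a + b*sqrt(r), with a, b, r rational and r fixed."""
    a: Fraction
    b: Fraction
    r: Fraction

    def __add__(self, other):
        return QuadElem(self.a + other.a, self.b + other.b, self.r)

    def __mul__(self, other):
        # (a + b√r)(c + d√r) = (ac + rbd) + (ad + bc)√r
        return QuadElem(self.a * other.a + self.r * self.b * other.b,
                        self.a * other.b + self.b * other.a,
                        self.r)

    def conj(self):
        """The conjugate a - b*sqrt(r)."""
        return QuadElem(self.a, -self.b, self.r)

u = QuadElem(Fraction(1), Fraction(2), Fraction(3))         # 1 + 2√3
v = QuadElem(Fraction(-5, 2), Fraction(1, 3), Fraction(3))  # -5/2 + (1/3)√3
print((u + v).conj() == u.conj() + v.conj())
print((u * v).conj() == u.conj() * v.conj())
```

Storing the pair (a, b) exactly, instead of a decimal approximation of the number, is what makes the equality tests meaningful.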


Theorem 12.3.21. If $a + b\sqrt{r}$ is in $\mathcal{F}(\sqrt{r})$ and is a root of a polynomial with
rational coefficients, then $a - b\sqrt{r}$ is also a root of the polynomial.


Proof. Suppose that $a_n(a + b\sqrt{r})^n + a_{n-1}(a + b\sqrt{r})^{n-1} + \cdots + a_1(a + b\sqrt{r}) + a_0 = 0$.
Then,

$$\overline{a_n(a + b\sqrt{r})^n + a_{n-1}(a + b\sqrt{r})^{n-1} + \cdots + a_1(a + b\sqrt{r}) + a_0} = \overline{0} = 0$$

Since each of the coefficients $a_k$ is rational, $\overline{a_k} = a_k$ for every k. Using this fact and
the facts that the conjugate of a sum is the sum of the conjugates and the conjugate
of a product is the product of the conjugates (Theorem 12.3.20), it follows that

$$a_n\left(\overline{a + b\sqrt{r}}\right)^n + a_{n-1}\left(\overline{a + b\sqrt{r}}\right)^{n-1} + \cdots + a_1\left(\overline{a + b\sqrt{r}}\right) + a_0 = 0$$

Thus, $a - b\sqrt{r} = \overline{a + b\sqrt{r}}$ is also a root of the polynomial. □


Theorem 12.3.22. If a cubic equation with rational coefficients has a constructible
root, then the equation has a rational root.


Proof. Dividing through by the leading coefficient, we can assume that the coefficient
of $x^3$ is 1. Then, by Theorem 12.3.18, the sum of the three roots of the cubic
equation is the negative of the coefficient of $x^2$, and is therefore rational.

We first show that if the equation has a root in any $\mathcal{F}(\sqrt{r})$, then it has a root in $\mathcal{F}$.
To see this, suppose the equation has a root in $\mathcal{F}(\sqrt{r})$ of the form $a + b\sqrt{r}$ with
$b \neq 0$. (If $b = 0$, the root is already in $\mathcal{F}$.) Then, by Theorem 12.3.21, the conjugate
$a - b\sqrt{r}$ is also a root. If $r_3$ is the third root and s is the sum of all three roots, then
$s = r_3 + (a + b\sqrt{r}) + (a - b\sqrt{r}) = r_3 + 2a$. Thus, $r_3 = s - 2a$. Since $\mathcal{F}$ contains all
rational numbers and s is rational, s is in $\mathcal{F}$. Since a is also in $\mathcal{F}$, it follows that the
root $r_3$ is in $\mathcal{F}$ itself.

The preliminary result obtained in the previous paragraph allows us to prove the
theorem, as follows. If the polynomial has a constructible root, then, since every
constructible number is a surd (Theorem 12.3.12), the root is in a field that occurs
at the end of a tower. Consider the field at the end of the shortest tower that contains
any root of the given cubic equation. We claim that field is $\mathbb{Q}$. To see this, simply
note that if that field were $\mathcal{F}(\sqrt{r})$, the previous paragraph would imply that there is
a root in $\mathcal{F}$, which would be at the end of a shorter tower than $\mathcal{F}(\sqrt{r})$ is in. Hence,
that field is $\mathbb{Q}$. Thus, the equation has a rational root. □


We can now prove that an angle of 20° cannot be constructed. 


Theorem 12.3.23. An angle of 20° cannot be constructed with straightedge and 
compass. 


Proof. If an angle of 20° could be constructed with straightedge and compass,
then $\cos(20^\circ)$ would be a constructible number (Theorem 12.3.13). Then $2\cos(20^\circ)$
would also be a constructible number, and the equation $x^3 - 3x - 1 = 0$ would
therefore have a constructible root (Corollary 12.3.17). It follows from the previous
theorem (12.3.22) that this equation would have a rational root. Thus, to establish
that an angle of 20° is not constructible, all that remains to be shown is that the
equation $x^3 - 3x - 1 = 0$ does not have a rational root.

Suppose that m and n are integers with $n \neq 0$ and that $\frac{m}{n}$, written in lowest
terms, is a root of the equation $x^3 - 3x - 1 = 0$. Then, by the Rational Roots
Theorem (8.2.14), m divides $-1$ and n divides 1. Thus, m is either 1 or $-1$ and n
is either 1 or $-1$. Hence, $\frac{m}{n}$ is 1 or $-1$. Therefore, the only possible rational roots
of $x^3 - 3x - 1 = 0$ are $x = 1$ or $x = -1$. Substituting those values for x into the
left side gives $1 - 3 - 1 = -3$ and $-1 + 3 - 1 = 1$, so neither is a root, and the
theorem is proven. □
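
The search for rational roots used in this proof is finite and easy to automate. The Python sketch below applies the Rational Roots Theorem to a polynomial with integer coefficients; the function names and the highest-power-first coefficient order are assumptions of this sketch, and the divisor search is deliberately naive.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of a polynomial with integer coefficients.

    coeffs lists the coefficients from the highest power down to the
    constant term; the first and last entries must be nonzero.
    """
    lead, const = coeffs[0], coeffs[-1]
    if lead == 0 or const == 0:
        raise ValueError("leading and constant coefficients must be nonzero")
    # Any root m/n in lowest terms has m dividing const and n dividing lead.
    candidates = {Fraction(sign * m, n)
                  for m in divisors(const)
                  for n in divisors(lead)
                  for sign in (1, -1)}

    def value(x):
        acc = Fraction(0)
        for c in coeffs:          # Horner's rule, exact arithmetic
            acc = acc * x + c
        return acc

    return sorted(x for x in candidates if value(x) == 0)

print(rational_roots([1, 0, -3, -1]))   # the cubic x^3 - 3x - 1
```

For $x^3 - 3x - 1$ the only candidates are 1 and $-1$, and the function returns an empty list, matching the argument above.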


Corollary 12.3.24. An angle of 60° cannot be trisected with straightedge and 
compass. 


Proof. As we have seen, an angle of 60° can be constructed with a straightedge and 
compass (Theorem 12.3.14). If an angle of 60° could be trisected with straightedge 
and compass, then an angle of 20° would be constructible. But an angle of 20° is 
not constructible, by the previous theorem (12.3.23). □


Another problem that the ancient Greeks raised but could not solve was what they
called duplication of the cube. This was the question of whether or not a side of a
cube of volume 2 could be constructed by straightedge and compass.


Theorem 12.4.1. The side of a cube of volume 2 cannot be constructed with a 
straightedge and compass. 


Proof. If x is the length of the side of a cube of volume 2, then, of course, $x^3 = 2$, or
$x^3 - 2 = 0$. By Theorem 12.3.22, this equation has a constructible root if and only if
it has a rational root. Since the cube root of 2 is irrational (Problem 13 in Chapter 8),
there is no constructible root. Therefore, the cube cannot be “duplicated” using only
a straightedge and compass. □


The question of which regular polygons can be constructed is very interesting. 


Definition 12.4.2. A polygon is a figure in the plane consisting of line segments 
that bound a finite portion of the plane. A regular polygon is a polygon all of whose 
angles are equal and all of whose sides are equal. 


An equilateral triangle is a regular polygon with three sides. Equilateral trian- 
gles can easily be constructed with straightedge and compass (see the proof of 
Theorem 12.3.14). 

A square is a regular polygon with four sides. It is also very easy to con- 
struct a square. Simply use the straightedge to draw any line segment, and erect 
perpendiculars at each end of the line segment (Theorem 12.1.7). Then use the 
compass to “measure” the length of the line segment and mark points which are 
that distance above the original line segment on each of the perpendiculars. Using 
the straightedge to connect those points yields a square. 

For each natural number n greater than or equal to 3, there exists a regular 
polygon with n sides. This can be seen as follows. (Which of these regular polygons 
is constructible is a more difficult question that we discuss in Theorem 12.4.5.) 




Theorem 12.4.3. For each natural number n greater than or equal to 3, there is a 
regular polygon with n sides inscribed in a circle. 


Proof. Given a natural number n greater than or equal to 3, take a circle and draw
(although it may not be possible to construct) successive adjacent angles of $\frac{360}{n}$
degrees at the center, as shown in Figure 12.14 for the case n = 6. Then draw
the line segments connecting adjacent points determined by the sides of the angles 
intersecting the circumference of the circle. We must show that those line segments 
are all equal in length and that the angles formed by each pair of adjacent line 
segments are equal to each other. 


Fig. 12.14 Existence of regular polygons 


Consider, for example, the triangles OAB and OCD in Figure 12.14. The angles
AOB and COD are each equal to $\frac{360}{n}$ degrees. The sides OA, OB, OC, and OD
are all radii of the given circle and are therefore equal to each other. It follows that
△OAB is congruent to △OCD by side-angle-side (11.1.2). The same proof shows
that all of the triangles constructed are congruent to each other. It follows that all of
the sides of the polygon, which are the sides opposite the angles of $\frac{360}{n}$ degrees in
the triangles, are equal to each other. The angles of the polygon are angles such as
∠ABC and ∠BCD in the diagram. Each of them is the sum of two base angles of
the drawn triangles, and, therefore, the angles of the polygon are equal to each other
as well. □


Definition 12.4.4. A central angle of a regular polygon with n sides is an angle of
$\frac{360}{n}$ degrees that has a vertex at the center of the polygon, as in the above proof.


Theorem 12.4.5. A regular polygon is constructible if and only if its central angle 
is a constructible angle. 


Proof. Suppose that a regular polygon can be constructed with straightedge and
compass. Then its center (a point equidistant from all of its vertices) can be
constructed as the point of intersection of the perpendicular bisectors of two
adjacent sides of the polygon (see Problem 13 at the end of this chapter). Now
the central angle can be constructed as the angle formed by connecting the
center to two adjacent vertices of the polygon. All such angles are equal to each
other, since the corresponding triangles are congruent by side-side-side (11.1.8).
There are n such angles, the sum of which is 360 degrees, so each central angle
is $\frac{360}{n}$ degrees.

Conversely, suppose that, for some natural number $n \geq 3$, an angle of $\frac{360}{n}$ degrees is
constructible. Then a regular polygon with n sides can be constructed as follows.
Construct a circle. Construct an angle of $\frac{360}{n}$ degrees with vertex at the center of the circle.
Then construct another such angle adjacent to the first, and so on until n such angles
have been constructed. Connecting the adjacent points of intersection of the sides of
those angles with the circle constructs a regular polygon with n sides (as shown in
the proof of Theorem 12.4.3). □


Corollary 12.4.6. A regular polygon with 18 sides cannot be constructed with a 
straightedge and compass. 


Proof. A regular polygon with 18 sides has a central angle of $\frac{360}{18} = 20$ degrees. We
proved in Theorem 12.3.23 that an angle of 20° is not constructible, so the previous
theorem implies that a regular polygon with 18 sides is not constructible. □


Theorem 12.4.7. If m is a natural number greater than 2, then a regular polygon
with 2m sides is constructible if and only if a regular polygon with m sides is
constructible.


Proof. Using Theorem 12.4.5, the result follows by either bisecting or doubling the
central angle of the already constructed polygon. □


Corollary 12.4.8. A regular polygon with 9 sides is not constructible. 


Proof. This follows immediately from the fact that a regular polygon with 18 sides 
is not constructible (Corollary 12.4.6) and the above theorem (12.4.7). □


It is useful to make the following connection between constructible polygons and 
constructible numbers. 


Theorem 12.4.9. A regular polygon with n sides is constructible if and only if the 
length of the side of a regular polygon with n sides that is inscribed in a circle of 
radius 1 is a constructible number.


Proof. In the first direction suppose that a regular polygon with n sides is con- 
structible. Then such a polygon can be constructed so that it is inscribed in a 
circle of radius 1 (for example, by putting its constructible central angle in a circle
of radius 1). The length of the side can be constructed by using the compass to 
“measure” the side of the constructed polygon. 

Conversely, if s is a constructible number and s is the length of the side 
of a regular polygon with n sides inscribed in a circle of radius 1, then the 
regular polygon can be constructed by marking any point on the circle and then 
using the compass to successively mark points that are at distance s from the 
previously marked one. The marked points will be vertices of a regular polygon 
with n sides. □


Can a regular pentagon (a regular polygon with 5 sides) be constructed using only a
straightedge and compass? The answer is affirmative, but this is not at all easy to 
see directly. We will approach this by first considering a regular polygon with 10 
sides. 


Theorem 12.4.10. A regular polygon with 10 sides is constructible. 


Proof. By Theorem 12.4.9, it suffices to show that the length of a side of such a 
polygon inscribed in a circle of radius 1 is a constructible number. We determine 
the length of such a side by using a little geometry. The central angle of a regular 
polygon with 10 sides is 36°. Consider such an angle with vertex O at the center of 
a circle of radius 1, as shown in Figure 12.15. Label the points of intersection of the 
sides of that central angle with the circle A and B. Let s denote the length of the line 
segment from A to B, and let AC be the bisector of ∠OAB, with C on the segment OB.
Since ∠OAB is 72° (the sum of the degrees of the equal angles OAB and ABO must be
$180^\circ - 36^\circ$), it follows that angles OAC and CAB are each 36°. Also, ∠OBA is 72°. Thus,
triangles OAB and ABC are similar to each other, so their corresponding sides are
in proportion (Theorem 11.3.9). Therefore, triangle ABC is isosceles, and AC has
length s. 


Fig. 12.15 The side of a ten-sided regular polygon 


Since ∠AOB = 36° = ∠OAC, △OAC is also isosceles. Thus, OC has length s,
from which it follows that BC has length $1 - s$. Since the corresponding sides of
triangles OAB and ABC are in proportion, it follows that

$$\frac{s}{1 - s} = \frac{1}{s}$$

Hence, the length we are interested in, s, satisfies the equation $s^2 = 1 - s$, or
$s^2 + s - 1 = 0$. The positive solution of this equation (s is a length) is $s = \frac{-1 + \sqrt{5}}{2}$.
Therefore s is a constructible number (Theorem 12.3.12), from which it follows
that the regular polygon with 10 sides is constructible. □


Corollary 12.4.11. A regular pentagon is constructible.

Proof. This follows immediately from the above theorem and Theorem 12.4.7. □
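
The value of s found in the proof of Theorem 12.4.10 can be cross-checked numerically. The sketch below compares it with the chord length $2\sin(18^\circ)$; that formula for a chord subtending a 36° central angle in a unit circle is standard trigonometry and is an assumption of this sketch, not something derived in the text.

```python
import math

s = (math.sqrt(5) - 1) / 2                 # positive root of s^2 + s - 1 = 0
chord = 2 * math.sin(math.radians(18))     # chord for a 36-degree central angle

print(s, chord)            # the two values agree up to rounding
print(s * s + s - 1)       # zero up to rounding
```

Because s is built from rational numbers by field operations and one square root, it is a surd, which is what makes the decagon (and hence the pentagon) constructible.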


Which regular polygons are constructible? As we have shown, those with 3, 4, 
and 5 sides are, and thus so are those with 6, 8, and 10 sides (Theorem 12.4.7). We 
proved that a regular polygon with 9 sides is not constructible (Corollary 12.4.8). 

What about a polygon with 7 sides? We can approach this question using
some facts that we learned about complex numbers. As follows immediately from
a previous result (Example 9.2.13), for each natural number n greater than 2,
the complex solutions to the equation $z^n = 1$ are the vertices of an n-sided
regular polygon inscribed in a circle of radius 1. We will approach the problem
by considering the solutions of $z^7 = 1$.


Theorem 12.4.12. A regular polygon with 7 sides is not constructible.


Proof. If a regular polygon with 7 sides was constructible, then one could be
constructed inscribed in a circle of radius 1 centered at the origin, such that one
of the vertices lies on the x-axis at the point corresponding to the number 1. Then
the vertices are the 7th roots of unity (Example 9.2.13); that is, they satisfy $z^7 = 1$.

We will analyze the first vertex above the x-axis. Let that vertex lie at the
complex number $z_0$. If the regular polygon was constructible, then $z_0$ would be a
constructible point, and therefore the real part of $z_0$ would be constructible (simply
use Theorem 12.1.8 to construct a perpendicular from $z_0$ to the x-axis). It would
follow that twice the real part is constructible. Let $x_0$ be twice that real part. We
will show that $x_0$ satisfies a cubic equation that is not satisfied by any constructible
number.

Begin by observing that $x_0 = z_0 + \overline{z_0}$. Since $|z_0| = 1$, it follows that $1 = |z_0|^2 = z_0\overline{z_0}$.
Thus, $\overline{z_0} = \frac{1}{z_0}$, so $x_0 = z_0 + \frac{1}{z_0}$. The cubic equation satisfied by $x_0$ will be
obtained from the equation $z_0^7 = 1$ and the fact that $z_0 \neq 1$. Note that

$$z_0^7 - 1 = (z_0 - 1)(z_0^6 + z_0^5 + z_0^4 + z_0^3 + z_0^2 + z_0 + 1)$$

Since $z_0 - 1 \neq 0$,

$$z_0^6 + z_0^5 + z_0^4 + z_0^3 + z_0^2 + z_0 + 1 = 0$$

Dividing through by $z_0^3$ yields

$$z_0^3 + z_0^2 + z_0 + 1 + \frac{1}{z_0} + \frac{1}{z_0^2} + \frac{1}{z_0^3} = 0$$

Note that $\left(z_0 + \frac{1}{z_0}\right)^3 = z_0^3 + 3z_0 + \frac{3}{z_0} + \left(\frac{1}{z_0}\right)^3$ and also that
$\left(z_0 + \frac{1}{z_0}\right)^2 = z_0^2 + 2 + \left(\frac{1}{z_0}\right)^2$. It follows that

$$z_0^3 + z_0^2 + z_0 + 1 + \frac{1}{z_0} + \frac{1}{z_0^2} + \frac{1}{z_0^3} = \left(z_0 + \frac{1}{z_0}\right)^3 + \left(z_0 + \frac{1}{z_0}\right)^2 - 2\left(z_0 + \frac{1}{z_0}\right) - 1$$

Then, since $x_0 = z_0 + \frac{1}{z_0}$, $x_0$ satisfies the equation

$$x_0^3 + x_0^2 - 2x_0 - 1 = 0$$

As indicated, to show that a regular polygon with 7 sides is not constructible, it
suffices to show that $x_0$ is not a constructible number. Since $x_0$ satisfies this cubic
equation with rational coefficients, the result will follow if it is shown that this cubic
equation has no rational root (Theorem 12.3.22). Suppose $\frac{m}{n}$ is a rational root written
in lowest terms. Then, by the Rational Roots Theorem (8.2.14), m divides $-1$ and
n divides 1. Therefore, m is 1 or $-1$, and n is 1 or $-1$. Thus, $\frac{m}{n}$ equals 1 or $-1$. But
$1^3 + 1^2 - 2 - 1$ is not 0, nor is $(-1)^3 + (-1)^2 + 2 - 1$. Hence, there is no rational
solution, and the theorem is proven. □
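
The algebra in this proof can be spot-checked numerically. The Python sketch below takes $z_0$ to be the 7th root of unity with the smallest positive argument (written here as $e^{2\pi i/7}$, an assumption consistent with "the first vertex above the x-axis") and evaluates the cubic at $x_0$; the names `q`, `z0`, and `x0` are illustrative.

```python
import cmath
import math

def q(x):
    """The cubic x^3 + x^2 - 2x - 1 from the proof of Theorem 12.4.12."""
    return x**3 + x**2 - 2 * x - 1

z0 = cmath.exp(2j * math.pi / 7)   # first 7th root of unity above the x-axis
x0 = (z0 + 1 / z0).real            # twice the real part of z0

print(x0, q(x0))                   # q(x0) is zero up to rounding
```

This is floating-point evidence only; the non-constructibility itself rests on the rational-root argument in the proof.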


In a sense, it is known exactly which regular polygons are constructible. The 
Gauss-Wantzel Theorem (which we will not prove) states that a regular polygon 
with n sides is constructible if and only if n is $2^k$, where k is an integer greater
than 1, or n is $2^k F_1 F_2 \cdots F_l$, where k is a nonnegative integer and the $F_j$ are distinct
Fermat primes. Recall (Problem 14 in Chapter 2) that a Fermat number is a number
of the form $2^{2^n} + 1$ for a nonnegative integer n. A Fermat prime is a Fermat number
that is prime. The first few Fermat numbers are 3 (when n = 0), 5 (when n = 1), 
17 (when n = 2), and 257 (when n = 3). Fermat thought that all Fermat numbers 
might be prime, but Euler found that the sixth Fermat number (when n = 5) is 
not prime. It is a remarkable fact that it is still unknown whether or not there are 
an infinite number of Fermat primes. (It is equally remarkable that it is not known 
whether there are infinitely many composite Fermat numbers.) It is therefore not 
known whether or not there are an infinite number of constructible regular polygons 
with an odd number of sides. 
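
The Gauss-Wantzel criterion is straightforward to apply to a specific n. The Python sketch below assumes that the only Fermat primes that can divide n are the five currently known ones (3, 5, 17, 257, and 65537); since it is unknown whether others exist, the function is only a sketch under that assumption. The function name is illustrative.

```python
KNOWN_FERMAT_PRIMES = (3, 5, 17, 257, 65537)   # assumption: no others divide n

def is_constructible_ngon(n):
    """Gauss-Wantzel test for a regular polygon with n >= 3 sides."""
    if n < 3:
        raise ValueError("a polygon needs at least 3 sides")
    while n % 2 == 0:          # strip the power of 2
        n //= 2
    for p in KNOWN_FERMAT_PRIMES:
        if n % p == 0:         # each Fermat prime may be used at most once
            n //= p
    return n == 1

print([n for n in range(3, 21) if is_constructible_ngon(n)])
```

The results agree with the cases settled in this chapter: polygons with 3, 4, 5, 6, 8, and 10 sides pass the test, while those with 7, 9, and 18 sides fail it.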

We can determine exactly which angles having a natural number of degrees are 
constructible. 


Theorem 12.4.13. If n is a natural number, then an angle of n degrees is
constructible if and only if n is a multiple of 3.


Proof. Recall that we proved that a regular polygon with 10 sides is 
constructible (Theorem 12.4.10) and, hence, that an angle of 36° is constructible 
(Theorem 12.4.5). Since an angle of 30° is constructible (Corollary 12.3.15), we 
can “subtract” a 30° angle from a 36° angle by placing the 30° angle with the vertex 
and one of its sides coincident with the vertex and one of the sides of the 36° angle 
(Theorem 12.1.6). Then, bisecting the constructed angle of 6° yields an angle of 3°. 
Once an angle of 3° is constructed, an angle of 3k degrees can be constructed by 
simply constructing k angles of 3° appropriately adjacent to each other. 

To establish the converse, suppose that an angle of n degrees is constructible. We
must show that n is congruent to 0 (mod 3). If n were congruent to either 1 or 2
modulo 3, then we could construct an angle of 1° or 2° accordingly by “subtracting”
an appropriate number of angles of 3° from the angle of n degrees. If the resulting
angle is 2°, bisecting it would yield an angle of 1°. Thus, if an angle of n degrees
was constructible for any n that was not a multiple of 3, then an angle of 1° could
be constructed. But an angle of 1° is not constructible, for, if it was, placing 20
of them together would contradict the fact that an angle of 20° is not constructible
(Theorem 12.3.23). □


We have shown that some angles, such as an angle of 60°, cannot be trisected 
with a straightedge and compass. But what about the following? 


Example 12.4.14 (Trisection of arbitrary acute angles). Let $\theta$ be any acute angle.
Mark any two points on a straightedge and let the distance between them be r. Given
the angle $\theta$, construct the circle with radius r whose center is at the vertex of $\theta$. Label
the center of the circle O. Extend one of the sides of $\theta$ in both directions. Move the
marked straightedge so that the point marked to the left is on the extended line, the
point marked to the right stays on the circle, and the straightedge passes through the
intersection of the circle and the side of $\theta$ that was not extended; label the points of
intersection A, B, C, as shown in Figure 12.16.


Fig. 12.16 On the way to trisecting an arbitrary angle $\theta$


Draw the line BO. Then the line segments AB, BO, and OC all have length r.
Now let the equal base angles of △ABO be x, the equal base angles of △OBC be
y, and let ∠BOC be z, as shown in Figure 12.17. Then the sum of ∠ABO and 2x
is 180°, and the sum of ∠ABO and y is also 180°; hence $y = 2x$. It is clear that
$x + z + \theta$ is 180°. On the other hand, $z + 2y$ is 180°. Since $y = 2x$, $4x + z$ is also
180°. It follows that $4x + z = x + z + \theta$, or $3x = \theta$. Thus, the angle x is one third
of $\theta$, so $\theta$ has been trisected. □
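
The configuration in this example can be checked with coordinates. The Python sketch below runs the argument in reverse: it draws the line through C that makes the angle $\theta/3$ with the extended side and confirms that the segment AB cut off between the extended line and the circle has length r, which is exactly the position the marked straightedge must occupy. The coordinate placement (O at the origin, the extended side along the x-axis) and the function name are assumptions of this sketch.

```python
import math

def neusis_points(theta_deg, r=1.0):
    """Points A, B, C of Example 12.4.14 for an acute angle theta (degrees)."""
    theta = math.radians(theta_deg)
    x = theta / 3                          # the claimed trisecting angle at A
    C = (r * math.cos(theta), r * math.sin(theta))
    # A: where the line through C with inclination x meets the x-axis.
    A = (C[0] - C[1] / math.tan(x), 0.0)
    # B: the intersection of that line with the circle that is nearer to A.
    # Points on the line are A + t*(cos x, sin x); requiring distance r
    # from O gives t^2 + 2*A_x*cos(x)*t + (A_x^2 - r^2) = 0.
    b = 2 * A[0] * math.cos(x)
    c = A[0] ** 2 - r ** 2
    t = (-b - math.sqrt(b * b - 4 * c)) / 2
    B = (A[0] + t * math.cos(x), t * math.sin(x))
    return A, B, C

A, B, C = neusis_points(60)
print(math.dist(A, B))    # agrees with r = 1 up to rounding
```

For $\theta = 60^\circ$ the line AC makes an angle of 20° with the extended side, an angle that cannot be produced with an unmarked straightedge and compass.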


What is going on here? You may think that the construction we have just done 
contradicts our earlier proof that an angle of 60° cannot be trisected. However, the 
construction in the example above violated the classical rules of constructions that 
we were adhering to before this example. Namely, we marked two points on the
straightedge.


Fig. 12.17 Trisecting an arbitrary angle $\theta$


What we have shown is that it is possible to trisect arbitrary angles with a 
compass and a straightedge on which two (or more) points are marked. Therefore, 
in particular, any angle can be trisected using a ruler and compass, but not merely 
using a straightedge and compass. 


12.5 Problems 
Basic Exercises 


1. Determine which of the following numbers are constructible: 


© Teh o (% 
(b) 79 G) cos 51° 
(c) 3.146891 (k) cos 5° 
(d) “79 (1) COs 10° 
() yo+ eee 


) VT4¥3 aa 
(g) V3 4+4V24 V5 


5 
(p) 22 
i) = 
(h) Ea (q) a v2 {Hint: Consider its square.] 


(r) “7 cos 15° 


2. Determine which of the following angles are constructible:

(a) 6°
(b) 5°
(c) 10°
(d) 30°
(e) 35°
(f) 15°
(g) 75°
(h) 80°
(i) 92.5°
(j) 37.5°
(k) 7.5°
(l) 120°
(m) 160°

3. Determine which of the following angles can be trisected:

(a) 12°
(b) 30°

Interesting Problems 


4. Determine which of the following polynomials have at least one constructible
root:

(a) x4 -—3 (HS GST 

(b) x8 —7 (g) x8 +4x4+1 

(ey al Te a 1 (iy) 2 eae ST 
(d) x3 + 6x? + 9x — 10 Gx? Spe 1 
(e) x3 — 3x? -2x +6 (j) 2x3 — 4x7 +1 


5. Determine which of the following regular polygons can be constructed with
straightedge and compass:


(a) A regular polygon with 14 sides 
(b) A regular polygon with 20 sides 
(c) A regular polygon with 36 sides 
(d) A regular polygon with 240 sides 


6. Explain how to construct a regular polygon with 24 sides using straightedge and
compass.


7. True or False:

(a) If the angle of $\theta$ degrees is constructible and the number x is constructible,
then the angle of $x \cdot \theta$ degrees is constructible.

(b) $x^y$ is constructible if x and y are each constructible.

(c) If * is constructible, then x and z are each constructible. 

(d) There is an angle $\theta$ such that $\cos\theta$ is constructible but $\sin\theta$ is not
constructible.


8. For an acute angle $\theta$, show that $\tan\theta$ is a constructible number if and only if $\theta$
is a constructible angle.


9. Determine which of the following numbers are constructible:


(a) sin 20° 
(b) sin 75° 
(c) tan2.5° 


10. Determine which of the following numbers are constructible (the angles below 
are in radians): 


: Iv 
(a) sin 7 
(b) cos z 
(c) tan Z 


11. (a) Prove that the cube cannot be tripled, in the sense that, starting with an edge 
of a cube of volume 1, an edge of a cube of volume 3 cannot be constructed 
with straightedge and compass. 

(b) More generally, prove that the side of a cube with volume a natural number 
n is constructible if and only if $n^{1/3}$ is a natural number. 


12. Using mathematical induction, prove that, for every integer n > 1, a regular 
polygon with $3 \cdot 2^n$ sides can be constructed with straightedge and compass. 


Challenging Problems 


13. Prove that the center of any given regular polygon can be constructed using only 
a straightedge and compass. 

[Hint: The center can be determined as the point of intersection of the 
perpendicular bisectors of two adjacent sides of the polygon. To prove that 
this point is indeed the center, prove that all the right triangles with one side 
a perpendicular bisector of a side of the polygon, another side a half of a side of 
the polygon, and the third side the line segment joining the “center” to a vertex 
of the polygon are congruent to each other.] 

14. Prove that an acute angle cannot be trisected with straightedge and compass if 
its cosine is: 


(a) 
(b) 
(c) 


(d) 
(e) 


MPR UNS 
Ble UI 


15. Can a polynomial of degree 4 with rational coefficients have a constructible root 
without having a rational root? 

16. Prove that the following equation has no constructible solutions: 


$$x^3 - 6x + 2\sqrt{2} = 0$$ 


[Hint: You can use Theorem 12.3.22 if you make an appropriate substitution.] 

17. Let t be a transcendental number. Prove that $\{a + bt : a, b \in \mathbb{Q}\}$ is not a 
subfield of $\mathbb{R}$. 

18. Say that a complex number a + bi is constructible if the point (a, b) is 
constructible (equivalently, if a and b are both constructible real numbers). 


Show that the cube roots of $\frac{1}{2} + \frac{\sqrt{3}}{2}i$ are not constructible. 
19. Let $\mathcal{F}$ be the smallest subfield of $\mathbb{R}$ that contains $\pi$. 


(a) Show that $\mathcal{F}$ consists of all numbers that can be written in the form $\frac{p(\pi)}{q(\pi)}$, 
where p and q are polynomials with rational coefficients and q is not the 
zero polynomial. 


(b) Show that $\mathcal{F}$ is countable. 


20. Is $\{a\sqrt{2} : a \in \mathbb{Q}\}$ a subfield of $\mathbb{R}$? 


21. Is the set of all towers countable? (Recall that a tower is a finite sequence of 
subfields of $\mathbb{R}$, the first of which is $\mathbb{Q}$, such that the other subfields are obtained 
from their predecessors by adjoining square roots; see Definition 12.2.16.) 

22. Prove the following: 


(a) If $x_0$ is a root of a polynomial with coefficients in $\mathcal{F}(\sqrt{r})$, then $x_0$ is a root 
of a polynomial with coefficients in $\mathcal{F}$. 

(b) Every constructible number is algebraic. 

(c) The set of constructible numbers is countable. 

(d) There is a circle with center at the origin that is not constructible. 


23. Let t be a transcendental number. Prove that t cannot be a root of any equation 
of the form $x^2 + ax + b = 0$ if a and b are constructible numbers. 

24. Is there a line in the plane such that every point on it is constructible? 

25. Find the cardinality of each of the following sets: 


(a) The set of roots of polynomials with constructible coefficients 

(b) The set of constructible angles 

(c) The set of all points (x, y) in the plane such that x is constructible and y is 
irrational 

(d) The set of all sets of constructible numbers 


26. (Very challenging) Use a straightedge and compass to directly (without first 
constructing its central angle or the length of the side of any polygon) construct 
a regular pentagon. 

27. Suppose that regular polygons with m sides and n sides can be constructed and 
m and n are relatively prime. Prove that a regular polygon of mn sides can be 
constructed. 

[Hint: Use central angles and use the fact that a linear combination of m and n 
is 1.] 

28. Prove the following: For natural numbers m and n, if a given angle can be 
divided into n equal parts using only a straightedge and compass, and if m is 
a divisor of n, then the angle can be divided into m equal parts using only a 
straightedge and compass. 
29. (Very challenging) Prove that you cannot trisect an angle by trisecting the side 
opposite the angle in a triangle containing it. That is, prove that, if ABC is any 
triangle, there do not exist two lines through A such that those lines trisect both 
the side BC of the triangle and the angle BAC of the triangle. 

[Hint: Suppose that there do exist two such lines. The lines then divide the 
triangle into three sub-triangles. One approach uses the easily established fact 
that all three sub-triangles have the same area.] 


Chapter 13 
An Introduction to Infinite Series 


What is .3333... (where the sequence of 3’s continues forever)? Presumably, this 
expression means 


$$\frac{3}{10} + \frac{3}{10^2} + \frac{3}{10^3} + \frac{3}{10^4} + \cdots$$ 


where the “···” indicates that the “sum” continues indefinitely. But then, what does 
that mean? What does it mean to add up an infinite number of terms? Is the sum $\frac{1}{3}$? 
Or is it merely close to $\frac{1}{3}$? Does it have a precise meaning? 

Similarly, what is 


$$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots$$ 


(where the indicated sum continues forever)? Expressions such as these are called 
“infinite series.” In some cases, as we shall see, there is a natural way of defining 
the sum of an infinite series. 

In this chapter, we present the basic facts about infinite series, emphasizing 
an understanding of the fundamental concepts. In order to make the central idea 
more accessible, our approach is a little unorthodox; the more standard approach is 
explained in Problem 27 at the end of the chapter. 


13.1 Convergence 


The rough idea is to define the sum of an infinite series to be S if adding any large 
enough (but finite) number of terms produces a sum that is very close to S. We will 
have to make precise what is meant by “very close.” Before we present the formal 
definitions, we discuss the following example: 
$$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots$$ 

We know what is meant by $\frac{1}{2} + \frac{1}{4}$; it is $\frac{3}{4}$. We know what is meant by 
$\frac{1}{2} + \frac{1}{4} + \frac{1}{8}$; it is $\frac{7}{8}$. Similarly, we know what is meant by 
$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16}$, or by the sum of any finite 
number of terms of this series. The sum of all the infinitely many terms is defined 
as a kind of a limit, as we now explain. 

For each natural number n, let $S_n = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^n}$ denote the sum of the 
first n terms of the series. We seek a simpler formula for $S_n$. Multiplying both sides 
of the equation defining $S_n$ by $\frac{1}{2}$ yields $\frac{1}{2}S_n = \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots + \frac{1}{2^{n+1}}$. Therefore, 


$$S_n - \frac{1}{2}S_n = \left(\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^n}\right) - \left(\frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots + \frac{1}{2^{n+1}}\right) = \frac{1}{2} - \frac{1}{2^{n+1}}$$ 


In other words, $\frac{1}{2}S_n = \frac{1}{2} - \frac{1}{2^{n+1}}$. Thus, $S_n = 2\left(\frac{1}{2} - \frac{1}{2^{n+1}}\right) = 1 - \frac{1}{2^n}$. 

This formula shows that $S_n$ is close to 1 if n is large. In fact, no matter how 
close we want $S_n$ to be to 1, we can guarantee that $S_n$ will be at least that close by 
choosing n sufficiently large. To see this, note that the difference between $S_n$ and 1 
is $\frac{1}{2^n}$. Thus, for example, we can guarantee that $S_n$ is within $\frac{1}{10}$ of 1 by taking n to 
be greater than or equal to 4, since $\frac{1}{2^4} = \frac{1}{16}$ is less than $\frac{1}{10}$. We can guarantee that $S_n$ is 
within $\frac{1}{1000}$ of 1 by choosing n to be greater than or equal to 10, since $\frac{1}{2^{10}}$ (that is, 
$\frac{1}{1024}$) is less than $\frac{1}{1000}$. Since $S_n$ is very close to 1 when n is large, the definition of 
the sum of an infinite series that we give below (Definition 13.1.4) implies that the 
sum of this infinite series is 1. 
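
The closed form for the partial sums can be checked numerically. The following is a minimal sketch in Python (the function name `partial_sum` is ours, not the book's); it uses exact rational arithmetic so that no rounding error enters the comparison.

```python
from fractions import Fraction

def partial_sum(n):
    """Sum of the first n terms 1/2 + 1/4 + ... + 1/2**n, computed exactly."""
    return sum(Fraction(1, 2**k) for k in range(1, n + 1))

for n in (1, 2, 3, 4, 10):
    s = partial_sum(n)
    # Compare with the closed form S_n = 1 - 1/2**n.
    assert s == 1 - Fraction(1, 2**n)
    print(n, s, "distance from 1:", 1 - s)
```

For n = 10 the distance from 1 is 1/1024, which is smaller than 1/1000. A check of finitely many values of n illustrates the formula but does not replace the derivation.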

To define precisely what we mean by the sum of an infinite series, it is useful to 
have a notation for the distance between two real numbers. The following definition 
will play a central role. 


Definition 13.1.1. The absolute value of a real number a, denoted |a|, is defined to 
be the distance from a to 0 on the number line. More precisely, if a is positive or 
zero, then |a| is just a, and if a is negative, then |a| is equal to —a. 


For example, |1| = 1, |−5| = −(−5) = 5, $\left|-\frac{3}{4}\right| = -\left(-\frac{3}{4}\right) = \frac{3}{4}$, 
and |0| = 0. Therefore, the absolute value has the effect of “making a number 
nonnegative.” 
[Number line with the points −5, 0, and 5 marked] 


Fig. 13.1 Solutions to |a| = 5 
Consider the question: “Which real numbers a satisfy |a| = 5?” That is, “Which 
real numbers a have a distance of 5 from 0?” As can be seen by looking at the 
real line (as in Figure 13.1), there are only two directions one can move away from 
zero: the positive direction and the negative direction. Thus, there are only two real 
numbers that have a distance of 5 from 0; namely, 5 and —5. More generally, for 
every positive real number d, there are precisely two real numbers with absolute 
value equal to d; namely, d and —d. 

What does |a| ≤ 5 mean? This means that a has a distance from 0 that is less 
than or equal to 5. Looking at the real number line, we see that this means that a 
cannot be larger than 5 and also cannot be less than −5. In other words, |a| ≤ 5 is 
equivalent to −5 ≤ a ≤ 5. More generally, if d ≥ 0, then |a| ≤ d is equivalent to 
−d ≤ a ≤ d. 

The distance between two numbers a and b can be expressed using absolute 
values as |a — b|. We show this as follows. Since this is clearly true when a or b 
is zero, we need only consider the cases where a and b are both nonzero. When a 
and b are both positive numbers it is not hard to see that the distance between them 
is |a — b|; simply picture a and b on a number line, and observe that, regardless of 
which is larger, the larger minus the smaller is |a − b| (which is the same as |b − a|). 
For example, the distance between 2 and 5 is 3, and whether we write |5 − 2| or 
|2 − 5|, we get 3. The case where a and b are both negative numbers can be obtained 
from the previous case by noting that the distance between a and b is the same as the 
distance between −a and −b, which is |(−a) − (−b)| = |−a + b| = |−(a − b)| = 
|a − b|. 

The case that takes a little more thought is the case where one of a and b is 
positive and the other is negative. Suppose that a is negative and b is positive (the 
other way around is handled analogously). Looking at the number line (Figure 13.2), 
since 0 is between a and b, we see that the distance between a and b is equal to the 
distance from a to 0 plus the distance from b to 0. That is, the distance between a 
and b is equal to |a| + |b|. Since a < 0 and b > 0, |a| + |b| is equal to (−a) + b = 
b − a = |b − a| = |a − b|. Thus, the distance between a and b is |a − b| in all cases. 
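
The case analysis above can be spot-checked with a short Python sketch. The helper `distance` is a name introduced here for illustration; it measures distance as "larger minus smaller", and the sample pairs cover each sign combination discussed in the text.

```python
def distance(a, b):
    """Distance between a and b on the number line: the larger minus the smaller."""
    return max(a, b) - min(a, b)

# One pair for each case: both positive, both negative, mixed signs, one zero.
cases = [(2, 5), (5, 2), (-2, -5), (-3, 4), (4, -3), (0, 7), (-7, 0)]
for a, b in cases:
    assert distance(a, b) == abs(a - b) == abs(b - a)
```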


a 0 b 


Fig. 13.2 Distance between two real numbers 


Using absolute values to denote distances between numbers is useful in formulating 
the precise definition of what is meant by the sum of an infinite series. 
We return to the example $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$. We want to capture the idea that, no 
matter how close we require the sum $S_n = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^n}$ to be to 1, we can 
get it that close by including a sufficient number of terms; i.e., by choosing n to be 
sufficiently large. 

It is common in mathematics to use the Greek letter epsilon, written in the form 
$\varepsilon$, to denote a small positive real number, as in the following. For every positive 
number $\varepsilon$, there is a natural number N such that $S_n$ is within $\varepsilon$ of 1 whenever n is 
greater than or equal to N. This can be reformulated in terms of absolute values as 
the statement: For every $\varepsilon > 0$ there is a natural number N such that $|S_n - 1| < \varepsilon$ 
whenever $n \ge N$. This precisely captures what is meant by the requirement that $S_n$ 
gets arbitrarily close to 1 when n is sufficiently large. 

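To see that this precise statement holds for $S_n = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^n} = 1 - \frac{1}{2^n}$, 
first note that $|S_n - 1| = \left|-\frac{1}{2^n}\right| = \frac{1}{2^n}$. Therefore, we must show that, for every 
$\varepsilon > 0$, there exists a natural number N such that $\frac{1}{2^n} < \varepsilon$ whenever $n \ge N$. 

Let $\varepsilon$ be any positive number. We find such an N by first noting that, since 
powers of 2 can be arbitrarily large, there is an N such that $2^N > \frac{1}{\varepsilon}$. Fix such 
an N. Multiplying both sides of $\frac{1}{\varepsilon} < 2^N$ by $\frac{\varepsilon}{2^N}$ gives $\frac{1}{2^N} < \varepsilon$. If $n \ge N$, then 
$\frac{1}{2^n} \le \frac{1}{2^N} < \varepsilon$. Therefore, for each $\varepsilon > 0$, if N is large enough that $2^N$ is greater 
than $\frac{1}{\varepsilon}$, then $S_n$ is within $\varepsilon$ of 1 whenever $n \ge N$, as desired. This proves that, 
according to the general definition we give below (13.1.4), the sum of the infinite 
series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots$ is 1. 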
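
The choice of N in this argument can be made concrete. The sketch below, in Python with exact fractions, searches for the smallest N with $\frac{1}{2^N} < \varepsilon$. The names `smallest_N` and `within_eps` are hypothetical helpers introduced here, and `within_eps` inspects only finitely many values of n, so it illustrates the statement rather than proving it.

```python
from fractions import Fraction

def smallest_N(eps):
    """Smallest natural number N with 1/2**N < eps (eps must be positive)."""
    N = 1
    while Fraction(1, 2**N) >= eps:
        N += 1
    return N

def within_eps(eps, N, how_many=50):
    """Check |S_n - 1| = 1/2**n < eps for n = N, ..., N + how_many - 1."""
    return all(Fraction(1, 2**n) < eps for n in range(N, N + how_many))

for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(1, 10**9)):
    N = smallest_N(eps)
    assert within_eps(eps, N)
    print("eps =", eps, "-> N =", N)
```

For $\varepsilon = \frac{1}{10}$ this gives N = 4 and for $\varepsilon = \frac{1}{1000}$ it gives N = 10, matching the values found earlier. Any larger N would serve equally well.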

Definition 13.1.2. An infinite series is an expression of the form 

$$a_1 + a_2 + a_3 + \cdots$$ 


where the $a_i$ are real numbers and the indicated sum continues forever. The $a_i$ are 
called the terms of the series. 


Some examples of infinite series are: 


Lodi 


7 $30 4 BP 47 28 


i ac 
123° 133° 143° 153 


Any particular infinite series may or may not have a sum according to the 
definition given below. The general definition is the same as the precise formulation 
in the special case considered above where $S_n = 1 - \frac{1}{2^n}$. The definition captures the 
rough idea that adding a large but finite number of terms of the series gives a partial 
sum that is close to the “sum” of the entire series. 
Definition 13.1.3. For an infinite series 

$$a_1 + a_2 + a_3 + a_4 + \cdots$$ 

the nth partial sum, often denoted by $S_n$, is the sum of the first n terms. That is, 

$$S_n = a_1 + a_2 + a_3 + \cdots + a_n$$ 


Definition 13.1.4. The infinite series with partial sums $S_n$ converges to S (or has 
sum S) if, for every $\varepsilon > 0$, there is a natural number N such that $|S_n - S| < \varepsilon$ 
whenever $n \ge N$. 


As we showed above, the series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$ converges to 1 according to this 
definition. 

Note that if there is any N that satisfies the definition for a particular $\varepsilon$, then there 
are infinitely many such N, for if $N_0$ satisfies the definition, so does any N larger 
than $N_0$. 

Note also that a series does not converge to a given S if there is any $\varepsilon$ greater than 
0 for which there is no N satisfying the definition. (Of course, if $\varepsilon_0$ is such an $\varepsilon$, 
then so is any smaller positive number.) 

For some infinite series, such as the example above where S = 1, it is not hard 
to determine the sum. There are other infinite series that can be shown to converge 
but for which finding an expression for the sum (other than the infinite series itself) 
is very difficult or even impossible. Moreover, there are infinite series that do not 
converge to any sum at all. 


Definition 13.1.5. An infinite series converges if there is some S for which the 
series converges to S according to Definition 13.1.4. If a series does not converge to 
any S, we say that the series diverges. 


Example 13.1.6. The series 1 + 1 + 1 + ··· diverges. 


Proof. The partial sums of this series are: $S_1 = 1$, $S_2 = 2$, $S_3 = 3$, and, for each 
natural number n, $S_n = n$. We want to show that there is no S that satisfies the 
definition of sum (13.1.4) for this series. Let S be any real number. If n is sufficiently 
large, then $S_n$ will be much larger than S. To show that S is not the sum of the series, 
it suffices to find any $\varepsilon > 0$ for which there is no N satisfying Definition 13.1.4. For 
this particular series, in fact, every $\varepsilon > 0$ has that property. For instance, take $\varepsilon = 3$. 
No matter what N is chosen, there are an infinite number of n’s greater than N for 
which $S_n$ is greater than S + 3. For those $S_n$, it is not the case that $|S_n - S| < 3$. 
Therefore, the series diverges. □ 


Example 13.1.7. The series 1 − 1 + 1 − 1 + 1 − 1 + ··· diverges. 


Proof. Consider the partial sums of this series: $S_1 = 1$, $S_2 = 0$, $S_3 = 1$, and so on. 
That is, the odd partial sums are all 1 and the even partial sums are all 0. To show that 
there is no S to which the series converges, it suffices to find some $\varepsilon > 0$ for which 
there is no N satisfying Definition 13.1.4. In this example, $\varepsilon = \frac{1}{2}$ (for instance) has 
that property. For, no matter what N is chosen, there will exist an $n_1 \ge N$ (any odd 
$n_1 \ge N$) and an $n_2 \ge N$ (any even $n_2 \ge N$) such that $S_{n_1} = 1$ and $S_{n_2} = 0$. There 
is no real number S that is within $\frac{1}{2}$ of both 1 and 0. Thus, the series diverges. □ 
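
The partial sums of the two divergent series above are easy to list. The following Python sketch uses the standard library's `itertools.accumulate`; the helper name `partial_sums` is introduced here only for illustration.

```python
from itertools import accumulate, cycle, islice

def partial_sums(terms, count):
    """First `count` partial sums of the series whose terms are `terms`."""
    return list(islice(accumulate(terms), count))

grandi = partial_sums(cycle([1, -1]), 8)  # 1 - 1 + 1 - 1 + ...
ones = partial_sums(cycle([1]), 8)        # 1 + 1 + 1 + ...
print(grandi)  # [1, 0, 1, 0, 1, 0, 1, 0]
print(ones)    # [1, 2, 3, 4, 5, 6, 7, 8]
```

The first list keeps alternating between 1 and 0, and the second grows without bound; these are the two behaviors that the proofs above rule out as convergence.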


The proofs of many results concerning infinite series require an important 
inequality concerning absolute values. It is called the “triangle inequality” because 
its generalization to vectors in the plane is equivalent to the fact that the sum of the 
lengths of two sides of a triangle is greater than or equal to the length of the third 
side (see Theorem 14.4.9 in Chapter 14). 


The Triangle Inequality 13.1.8. Let x and y be real numbers. Then 


|x + y| ≤ |x| + |y| 


Proof. Recall from our discussion about absolute values that if d ≥ 0, then |a| ≤ d 
is equivalent to −d ≤ a ≤ d. Therefore, we can prove the Triangle Inequality 
by showing that −(|x| + |y|) ≤ x + y ≤ |x| + |y|. For this, observe that for 
every real number a, −|a| ≤ a ≤ |a|. Thus, in particular, −|x| ≤ x ≤ |x| and 
−|y| ≤ y ≤ |y|. Adding these two inequalities gives us the desired inequality, 
−(|x| + |y|) ≤ x + y ≤ |x| + |y|. □ 
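
As a sanity check on the inequality, here is a small Python sketch that tests it on a handful of sample values. The sample list and the helper name `triangle_holds` are assumptions made for illustration; a finite check is of course no substitute for the proof.

```python
from fractions import Fraction
from itertools import product

samples = [Fraction(v) for v in (-5, -2, 0, 3)] + [Fraction(-3, 4), Fraction(7, 2)]

def triangle_holds(x, y):
    """True when |x + y| <= |x| + |y|."""
    return abs(x + y) <= abs(x) + abs(y)

assert all(triangle_holds(x, y) for x, y in product(samples, repeat=2))

# The inequality can be strict when the signs differ:
# |3 + (-5)| = 2, while |3| + |-5| = 8.
assert abs(3 + (-5)) < abs(3) + abs(-5)
```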


A fundamental question is: can a series have two different sums? That is, can 
a series converge to S and also converge to T, with S different from T? Our first 
application of the triangle inequality is in providing an answer to this question. 


Theorem 13.1.9. An infinite series converges to at most one real number. That is, 
if a series converges to S and also converges to T, then S = T. 


Proof. Let $S_n$ be the nth partial sum of a given series, and suppose that the series 
converges to S and also to T. We will show that, for every $\varepsilon > 0$, $|S - T| < \varepsilon$. Since 
0 is the only nonnegative real number that is less than every positive real number, it 
will then follow that $|S - T| = 0$; that is, S = T. 

Let $\varepsilon > 0$ be given. For every n, the Triangle Inequality (13.1.8) implies 


$$|S - T| = |S - S_n + S_n - T| = |(S - S_n) + (S_n - T)| \le |S - S_n| + |S_n - T|$$ 


Since the series converges to S, for every $\varepsilon_1 > 0$ there is an $N_1$ such that $|S_n - S| < \varepsilon_1$ 
for every $n \ge N_1$. Similarly, since the series converges to T, for every $\varepsilon_2 > 0$ 
there is an $N_2$ such that $|S_n - T| < \varepsilon_2$ for every $n \ge N_2$. Choose $\varepsilon_1 = \frac{\varepsilon}{2}$ and $\varepsilon_2 = \frac{\varepsilon}{2}$. 
If N is the larger of $N_1$ and $N_2$, then both inequalities are satisfied for all $n \ge N$. 
Thus, for $n \ge N$, 


$$|S - S_n| + |S_n - T| < \varepsilon_1 + \varepsilon_2 = \varepsilon$$ 


Therefore, $|S - T| < \varepsilon$, as desired. □ 


Despite the previous theorem, rearranging the order of the terms of an infinite 
series sometimes produces a series with a different sum (see Problem 19). That 
is, the order in which the terms of an infinite series are added is important under 
certain conditions (see Problems 21 and 22). This surprising possibility is, of course, 
different from the situation when adding a finite number of numbers. It is therefore 
important to note that the definition of a particular infinite series depends not only 
on the terms of the series but also on the order in which the terms are arranged. 
However, if the terms of the series are all nonnegative, then rearranging the order of 
the terms does not affect the sum (see Problem 22). 


13.2 Geometric Series 


The example $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$ that we considered above is a particular instance of 
a special kind of series that is easily dealt with. 


Definition 13.2.1. A series $a_1 + a_2 + a_3 + \cdots$ is called a geometric series if there is a 
number r, called the ratio of the series, such that each term of the series is obtained 
from the preceding term by multiplying by r. That is, there is a number r such that, 
if the first term of the series is a, then the second term is ar, the third term is $ar^2$, 
and so on. Such a series has the form 


$$a + ar + ar^2 + ar^3 + \cdots$$ 


for some real numbers a and r. 
The series that we previously discussed, 


$$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$$ 


is a geometric series with first term $a = \frac{1}{2}$ and ratio $r = \frac{1}{2}$. 

We begin the analysis of geometric series by finding a formula for the nth partial 
sum of a general geometric series. 


Theorem 13.2.2. If a is a real number and r is a real number other than 1, and if 


$$S_n = a + ar + ar^2 + ar^3 + \cdots + ar^{n-1}$$ 


then 


$$S_n = \frac{a - ar^n}{1 - r}$$ 


Proof. The proof is similar to that of the special case that we discussed above (where 
$a = \frac{1}{2}$ and $r = \frac{1}{2}$). Since 


$$rS_n = ar + ar^2 + \cdots + ar^n$$ 

it follows that $S_n - rS_n = a - ar^n$ (since all of the other terms cancel each other 
out when doing the subtraction). Thus, $(1 - r)S_n = a - ar^n$, or $S_n = \frac{a - ar^n}{1 - r}$. (Note 
that this formula for $S_n$ does not make sense when r = 1. Of course, $S_n$ is simply 
na in this case.) □ 
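
The formula of Theorem 13.2.2 can be compared against term-by-term addition. This is a minimal Python sketch with exact fractions; the function names and the sample values of a and r are chosen here for illustration only.

```python
from fractions import Fraction

def geometric_partial_sum(a, r, n):
    """S_n = a + a*r + ... + a*r**(n-1), added term by term."""
    return sum(a * r**k for k in range(n))

def closed_form(a, r, n):
    """(a - a*r**n) / (1 - r); only meaningful when r != 1."""
    return (a - a * r**n) / (1 - r)

samples = [
    (Fraction(1, 2), Fraction(1, 2)),   # the series 1/2 + 1/4 + 1/8 + ...
    (Fraction(3), Fraction(-2, 5)),     # a negative ratio
    (Fraction(7), Fraction(3)),         # |r| > 1: the formula still holds
]
for a, r in samples:
    for n in range(1, 12):
        assert geometric_partial_sum(a, r, n) == closed_form(a, r, n)

# When r = 1 the closed form is undefined, and S_n is simply n*a.
assert geometric_partial_sum(Fraction(5), Fraction(1), 4) == 20
```

Note that the formula for the partial sums holds for every ratio other than 1; it is only the convergence of the series that will require |r| < 1.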


We will use the above formula to determine which geometric series converge and 
to find the sum of a geometric series when it exists. We need the fact that if a number 
has absolute value less than 1, then its powers get arbitrarily small. That is, we need 
the following lemma. (Like $\varepsilon$, the Greek letter $\delta$, read “delta,” is often used to denote 
a small positive number.) 


Lemma 13.2.3. If |r| < 1, then for every positive number $\delta$ there is a natural 
number N such that $|r|^n$ is less than $\delta$ for all $n \ge N$. 


Proof. If r = 0, then $r^n = 0$ for all n, and therefore any N will do. If r is not 0, then 
|r| < 1 implies that $\frac{1}{|r|}$ is greater than 1. Define the positive number t by $t = \frac{1}{|r|} - 1$, 
so $\frac{1}{|r|} = 1 + t$. We need the fact that, for every natural number n, $(1 + t)^n$ is greater 
than or equal to 1 + nt (which can be proven easily by mathematical induction, as 
stated in Problem 5 of this chapter). Now let $\delta > 0$ be given. Choose any N that is 
large enough so that Nt is greater than $\frac{1}{\delta}$. Suppose that $n \ge N$. Then $nt \ge Nt > \frac{1}{\delta}$. 
Thus, $(1 + t)^n \ge 1 + nt > 1 + \frac{1}{\delta} > \frac{1}{\delta}$, so $(1 + t)^n > \frac{1}{\delta}$, or $\left(\frac{1}{|r|}\right)^n = \frac{1}{|r|^n} > \frac{1}{\delta}$. 
This implies that $|r|^n < \delta$, as desired. □ 
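
The proof is constructive: it tells us how to pick N from r and $\delta$. The sketch below follows that recipe in Python with exact fractions. The function name `N_from_proof` is hypothetical, and the code assumes 0 < |r| < 1 and $\delta > 0$. The N it returns is large enough but usually far from the smallest possible, since the bound $(1 + t)^n \ge 1 + nt$ is crude.

```python
from fractions import Fraction
from math import floor

def N_from_proof(r, delta):
    """An N with |r|**n < delta for all n >= N, chosen as in the proof.

    Assumes 0 < |r| < 1 and delta > 0.
    """
    t = 1 / abs(r) - 1                  # so 1/|r| = 1 + t with t > 0
    return floor(1 / (delta * t)) + 1   # guarantees N * t > 1/delta

r, delta = Fraction(-9, 10), Fraction(1, 100)
N = N_from_proof(r, delta)
assert all(abs(r) ** n < delta for n in range(N, N + 25))
print(N)  # 901
```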


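Theorem 13.2.4. If a is a real number and r is a real number with |r| < 1, then 
the geometric series $a + ar + ar^2 + \cdots + ar^{n-1} + \cdots$ converges to $\frac{a}{1 - r}$. 

Proof. The nth partial sum of the series is $S_n = \frac{a - ar^n}{1 - r}$ (Theorem 13.2.2). According 
to the precise definition of convergence (13.1.4), the theorem will be proven if we 
establish that for every $\varepsilon > 0$ there exists an N such that $\left|S_n - \frac{a}{1 - r}\right| < \varepsilon$ whenever 
$n \ge N$. Note that the difference between $\frac{a}{1 - r}$ and $S_n$ is 


$$\frac{a}{1 - r} - \frac{a - ar^n}{1 - r} = \frac{ar^n}{1 - r}$$ 


Let $\varepsilon > 0$ be given. If a = 0, then $\frac{ar^n}{1 - r} = 0$ for every n, so any N will do. Assume 
now that $a \ne 0$. By Lemma 13.2.3, there is an N such that for all $n \ge N$, $|r|^n$ is less 
than the positive number $\frac{1 - r}{|a|} \cdot \varepsilon$. Then, for all $n \ge N$, 


$$\left|S_n - \frac{a}{1 - r}\right| = \left|\frac{ar^n}{1 - r}\right| = \left|\frac{a}{1 - r}\right| \cdot |r^n| < \left|\frac{a}{1 - r}\right| \cdot \frac{1 - r}{|a|} \cdot \varepsilon = \varepsilon$$ 

Therefore, $\frac{a}{1 - r}$ is the sum of the series. □ 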


The example with which we began is a special case of the above. 
Example 13.2.5. $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = 1$ 


Proof. This is the special case of Theorem 13.2.4 where $a = \frac{1}{2}$ and $r = \frac{1}{2}$. □ 


There are, of course, many other geometric series. 


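Example 13.2.6. $\sqrt{17} + \frac{\sqrt{17}}{3} + \frac{\sqrt{17}}{3^2} + \frac{\sqrt{17}}{3^3} + \cdots = \frac{3\sqrt{17}}{2}$ 


Proof. This follows from Theorem 13.2.4 using $a = \sqrt{17}$ and $r = \frac{1}{3}$. The sum is 
$\frac{a}{1 - r} = \frac{\sqrt{17}}{\left(\frac{2}{3}\right)} = \frac{3\sqrt{17}}{2}$. □ 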


Geometric series can have negative ratios. 


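Example 13.2.7. $1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} + \frac{1}{16} - \cdots = \frac{2}{3}$ 


Proof. This is the case of Theorem 13.2.4 where a = 1 and $r = -\frac{1}{2}$. The sum is 
$\frac{1}{1 - \left(-\frac{1}{2}\right)} = \frac{1}{\left(\frac{3}{2}\right)} = \frac{2}{3}$. □ 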


The proper interpretation of .333333 ... is as an infinite series; it is another way 
of writing the series $\frac{3}{10} + \frac{3}{10^2} + \frac{3}{10^3} + \cdots$. Hence, we can determine its value. 

Example 13.2.8. The infinite decimal .33333... is $\frac{1}{3}$. 

Proof. This follows from Theorem 13.2.4 with $a = \frac{3}{10}$ and $r = \frac{1}{10}$: the sum is 


$$\frac{\left(\frac{3}{10}\right)}{\left(1 - \frac{1}{10}\right)} = \frac{\left(\frac{3}{10}\right)}{\left(\frac{9}{10}\right)} = \frac{1}{3}$$ □ 

We shall see in Section 13.6 that every infinite decimal $.b_1b_2b_3\ldots$, where each 
$b_i$ is an integer between 0 and 9, is a convergent series (Theorem 13.6.2), although 
most infinite decimals are not geometric series. 
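
The sums in Examples 13.2.5, 13.2.7, and 13.2.8 can be reproduced with a few lines of Python using exact fractions. The function `geometric_sum` below is a hypothetical helper written for this illustration; it simply evaluates $\frac{a}{1 - r}$ and refuses ratios outside the range covered by Theorem 13.2.4.

```python
from fractions import Fraction

def geometric_sum(a, r):
    """Sum a/(1 - r) of a + a*r + a*r**2 + ..., valid only when |r| < 1."""
    if abs(r) >= 1:
        raise ValueError("Theorem 13.2.4 requires |r| < 1")
    return a / (1 - r)

print(geometric_sum(Fraction(1, 2), Fraction(1, 2)))    # 1
print(geometric_sum(Fraction(1), Fraction(-1, 2)))      # 2/3
print(geometric_sum(Fraction(3, 10), Fraction(1, 10)))  # 1/3
```

Raising an error for |r| ≥ 1 is a design choice made here: the expression $\frac{a}{1 - r}$ can still be evaluated for such r (other than r = 1), but it is then not the sum of the series.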


13.3 Convergence of Related Series 


In some cases the convergence of a series can be established by using the fact that a 
related series is known to converge. 


Theorem 13.3.1. If $a_1 + a_2 + a_3 + \cdots$ is an infinite series that converges to S and 
c is any real number, then the infinite series $ca_1 + ca_2 + ca_3 + \cdots$ (obtained by 
multiplying each of the terms of the original series by c) converges to cS. 


Proof. We want to show that the sum of the series $ca_1 + ca_2 + ca_3 + \cdots$ is cS. 
By Definition 13.1.4, we need to show that, for every $\varepsilon > 0$, there exists a natural 
number N such that the absolute value of the difference between cS and the nth 
partial sum of the series is less than $\varepsilon$ for every $n \ge N$. In the case where c = 0, 
the series obviously converges to 0 · S = 0. We therefore assume that $c \ne 0$ in what 
follows. 

Let $S_n$ be the nth partial sum of the series $a_1 + a_2 + a_3 + \cdots$. Then the nth partial 
sum of the series $ca_1 + ca_2 + ca_3 + \cdots$ is $cS_n$. Therefore, we need to consider 
$|cS_n - cS|$. But this is equal to $|c(S_n - S)| = |c| \cdot |S_n - S|$. This enables us to 
prove the theorem, as follows. Let $\varepsilon > 0$ be given. Since $a_1 + a_2 + a_3 + \cdots$ 
converges to S, there is a natural number N such that $S_n$ is within the positive number 
$\frac{\varepsilon}{|c|}$ of S for every $n \ge N$. That is, $|S_n - S| < \frac{\varepsilon}{|c|}$ for every $n \ge N$. Therefore, 
$|cS_n - cS| = |c| \cdot |S_n - S| < |c| \cdot \frac{\varepsilon}{|c|} = \varepsilon$ for every $n \ge N$, so Definition 13.1.4 is 
satisfied by cS. □ 
Example 13.3.2. The infinite decimal .66666... is equal to $\frac{2}{3}$ and the infinite 
decimal .99999 ... is equal to 1. 


Proof. We could establish both of these facts using Theorem 13.2.4, as we did in 
Example 13.2.8. But, since we already know that the infinite decimal .33333... is 
$\frac{1}{3}$, we can use the theorem we just proved (13.3.1) to get the result even more simply. 
The infinite decimal .66666... is twice the infinite decimal .33333.... More 
precisely, each of the terms of the infinite series representing .66666... is equal to 
twice the corresponding term of the infinite series that represents .33333.... Thus, 
by Theorem 13.3.1, .66666... is equal to $2 \cdot \frac{1}{3} = \frac{2}{3}$. Similarly, since .99999 ... is 
3 times .33333..., Theorem 13.3.1 implies that .99999... is $3 \cdot \frac{1}{3} = 1$. □ 


Theorem 13.3.3. If $a_1 + a_2 + a_3 + \cdots$ is an infinite series that converges to S and 
$b_1 + b_2 + b_3 + \cdots$ is an infinite series that converges to T, then the infinite series 
$(a_1 + b_1) + (a_2 + b_2) + (a_3 + b_3) + \cdots$ converges to S + T. 


Proof. We must show that the sum of the series $(a_1 + b_1) + (a_2 + b_2) + (a_3 + b_3) + \cdots$ 
is S + T. By Definition 13.1.4, we need to show that, for every $\varepsilon > 0$, there exists a 
natural number N such that the absolute value of the difference between S + T and 
the nth partial sum of the series is less than $\varepsilon$ for every $n \ge N$. 

Let $S_n$ be the nth partial sum of the series $a_1 + a_2 + a_3 + \cdots$, and let $T_n$ be the 
nth partial sum of the series $b_1 + b_2 + b_3 + \cdots$. Then $S_n + T_n$ is the nth partial sum 
of the series $(a_1 + b_1) + (a_2 + b_2) + (a_3 + b_3) + \cdots$. Therefore, we need to analyze 
$|(S_n + T_n) - (S + T)|$. First, $|(S_n + T_n) - (S + T)| = |(S_n - S) + (T_n - T)|$. By the 
Triangle Inequality (13.1.8), $|(S_n - S) + (T_n - T)| \le |S_n - S| + |T_n - T|$. Thus, 
$|(S_n + T_n) - (S + T)| \le |S_n - S| + |T_n - T|$. 

This inequality enables us to prove the theorem, as follows. Let $\varepsilon > 0$ be given. 
Since $a_1 + a_2 + a_3 + \cdots$ converges to S, there exists an integer $N_1$ such that 
$|S_n - S| < \frac{\varepsilon}{2}$ for every $n \ge N_1$. Since $b_1 + b_2 + b_3 + \cdots$ converges to T, there 
is an integer $N_2$ such that $|T_n - T| < \frac{\varepsilon}{2}$ for every $n \ge N_2$. Thus, if we let N be 
the larger of $N_1$ and $N_2$, then both of these inequalities hold whenever $n \ge N$. 
Therefore, $|(S_n + T_n) - (S + T)| \le |S_n - S| + |T_n - T| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$ for all 
$n \ge N$. □ 


The above theorem can be summarized as: “convergent series can be added term-by-term.” 
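
The key observations in the two proofs of this section, namely that the partial sums of the scaled series are $cS_n$ and those of the termwise sum are $S_n + T_n$, can be checked directly. The Python sketch below does so for .3333... and $\frac{1}{2} + \frac{1}{4} + \cdots$; the choice of these two series, the cutoff n = 12, and the helper name `partial_sums` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(terms):
    """All partial sums of a finite list of terms."""
    return list(accumulate(terms))

n = 12
a = [Fraction(3, 10**k) for k in range(1, n + 1)]  # terms of .3333...
b = [Fraction(1, 2**k) for k in range(1, n + 1)]   # terms of 1/2 + 1/4 + ...
S, T = partial_sums(a), partial_sums(b)

# Partial sums of the scaled series are c * S_n (here c = 2).
assert partial_sums([2 * x for x in a]) == [2 * s for s in S]

# Partial sums of the termwise sum are S_n + T_n.
assert partial_sums([x + y for x, y in zip(a, b)]) == [s + t for s, t in zip(S, T)]

# Distance of the 12th combined partial sum from 1/3 + 1 = 4/3.
gap = Fraction(4, 3) - (S[-1] + T[-1])
print(float(gap))
```

The remaining gap is the sum of the two individual gaps, which is exactly how the $\frac{\varepsilon}{2} + \frac{\varepsilon}{2}$ argument in the proof of Theorem 13.3.3 works.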
13.4 Least Upper Bounds 


In order to investigate infinite series other than geometric series, we need to 
understand an important property of the real numbers called “The Least Upper 
Bound Property.” 


Definition 13.4.1. If S is a set of real numbers, then the real number t is an upper 
bound for S if t is greater than or equal to every element of S. That is, t is an upper 
bound of S if x ≤ t for every x in S. 


of {—1, —2, —3, ...}; 5 is an upper bound of { —4, -4, — §,...}sand 28 is 


For example, 2 is an upper bound of the set {1, 7 7 Pek 1; —1 is an upper bound 
3 
an upper bound of {x : x < /2}. 
It is important to observe that if t is an upper bound of a set S, then every 
real number greater than t is also an upper bound of S. Thus, if a set S has an 
upper bound, it has infinitely many upper bounds. Some sets, however, such as 
{1, 2, 3, 4, 5, ...} and $\{(\sqrt{2})^n : n \in \mathbb{N}\}$, do not have any upper bounds at all. 


Definition 13.4.2. If S is a set of real numbers, then the real number t is a least 
upper bound of S if t is an upper bound of S and every upper bound of S is greater 
than or equal to t. That is, a least upper bound is a smallest upper bound. 


For example, a least upper bound of ‘Be 5 7 sath } is 1. This can be seen as 


follows. Since every number in the set is less than or equal to 1, 1 is an upper 
bound. If t is any upper bound, then t is greater than or equal to 1 (since 1 is in the 
set), so 1 is a least upper bound. 

A least upper bound of {—1, —2, —3, —4,—5,...} is —1. To see this, first 
observe that every number in the set is less than or equal to —1, so —1 is an upper 
bound. Every upper bound must be greater than or equal to —1 since —1 is in the 
set. 

A least upper bound of $\left\{3 - \frac{1}{2},\ 3 - \frac{1}{3},\ 3 - \frac{1}{4},\ \ldots\right\}$ is 3. To verify this, begin by 
noting that, since 3 is greater than any number in the set, 3 is an upper bound. To see 
that it is a least upper bound, note that if t is any upper bound, then t must be greater 
than or equal to $3 - \frac{1}{n}$ for every natural number n. If t were less than 3, then there 
would be some n such that $t < 3 - \frac{1}{n}$ (just choose n such that $\frac{1}{n} < 3 - t$). Thus, t 
would not be an upper bound. Therefore, all upper bounds are greater than or equal 
to 3, so 3 is a least upper bound. Note that, in this example, 3 is a least upper bound 
of the set but is not in the set. 

A least upper bound of $\{x : x < \sqrt{2}\}$ is $\sqrt{2}$. To see this, note that every number 
in the set is less than $\sqrt{2}$, so $\sqrt{2}$ is an upper bound. If t is a number less than $\sqrt{2}$, 
then $t + \frac{1}{2}(\sqrt{2} - t) = \frac{1}{2}(t + \sqrt{2}) < \frac{1}{2}(\sqrt{2} + \sqrt{2}) = \sqrt{2}$. Therefore $t + \frac{1}{2}(\sqrt{2} - t)$ 
is in the set. But $t + \frac{1}{2}(\sqrt{2} - t)$ is obviously greater than t, since $\sqrt{2} - t$ is positive. 
Therefore such a t is not an upper bound for the set. Thus, $\sqrt{2}$ is a least upper 
bound. 
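
The midpoint argument in the last example can be illustrated numerically. The Python sketch below uses floating-point approximations of $\sqrt{2}$, so it is only an illustration under that assumption; the function name `witness` and the sample values of t are ours.

```python
from math import sqrt

ROOT2 = sqrt(2)  # floating-point approximation of the square root of 2

def witness(t):
    """Given t < sqrt(2), return the midpoint t + (sqrt(2) - t)/2 used in the text."""
    return t + (ROOT2 - t) / 2

for t in (-3.0, 0.0, 1.0, 1.4, 1.41):
    m = witness(t)
    # m lies in the set {x : x < sqrt(2)} and exceeds t,
    # so t cannot be an upper bound of that set.
    assert t < m < ROOT2
    print(t, "->", m)
```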
Suppose that S is the empty set; that is, the set that does not contain any elements. 
Then every real number is an upper bound for S, for, no matter what real number 
is given, S does not contain any numbers greater than it. Since every real number is 
an upper bound for the empty set, the empty set does not have a least upper bound. 

A crucial property of the real numbers that we will need in order to understand 
infinite series, and which is also important in many other contexts, is the existence of 
least upper bounds. We assume as an axiom that the real numbers have this property. 
(In fact, this property can be proven from the Dedekind cuts construction of the real 
numbers in terms of sets of rational numbers; see Problem 26 at the end of this 
chapter.) 


The Least Upper Bound Property 13.4.3. Every nonempty set of real numbers 
that has an upper bound has a least upper bound. In other words, the set of upper 
bounds has a smallest element. More precisely, if S is any set other than the empty 
set and S has an upper bound, then there is an upper bound $t_0$ for S such that every 
upper bound for S is greater than or equal to $t_0$. 


We next show that a set cannot have two different least upper bounds. 


Theorem 13.4.4. If a nonempty set of real numbers has an upper bound, then the 
set has a unique least upper bound. 


Proof. By the Least Upper Bound Property (13.4.3), every nonempty set of real 
numbers that has an upper bound has a least upper bound; we must show that there 
is at most one least upper bound for a given set. Suppose that both $t_1$ and $t_2$ are least 
upper bounds of the same nonempty set. Then, since $t_1$ is a least upper bound and $t_2$ 
is another upper bound, $t_1 \le t_2$. Similarly, since $t_2$ is a least upper bound and $t_1$ is 
another upper bound, $t_2 \le t_1$. Therefore, $t_1 = t_2$. □ 


Theorem 13.4.5. If an infinite series converges, then the set of partial sums of the
series has an upper bound. 


Proof. Suppose that an infinite series converges to the sum S. Then, by
Definition 13.1.4, for every $\varepsilon > 0$ there exists an N such that $|S_n - S| < \varepsilon$ whenever
$n \geq N$. In particular, there is such an $N_0$ for $\varepsilon = 1$. That is, the absolute value
of $S_n - S$ is less than 1 whenever n is greater than or equal to $N_0$. It follows, in
particular, that for all $n \geq N_0$, $S_n - S < 1$. Therefore, $S_n < 1 + S$ for every $n \geq N_0$.
Thus, we have found an upper bound, $1 + S$, for the set of all partial sums except
for the first $N_0 - 1$ of them. Let $t$ be the largest of the numbers in the finite set
$\{S_1, S_2, \ldots, S_{N_0}\}$. Then the larger of $1 + S$ and $t$ is an upper bound for the set of all
partial sums. □


The converse of Theorem 13.4.5 is not true in general, as the next example shows. 


Example 13.4.6. The set of partial sums of the series $1 - 1 + 1 - 1 + \cdots$ has an
upper bound, but the series does not converge. 


Proof. In Example 13.1.7, we showed that this series diverges. Since the set of
partial sums is simply {0, 1}, 1 is an upper bound for the set of partial sums. □
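
As a quick numerical illustration of this example, the following Python sketch lists the first few partial sums of the series. The helper name `grandi_partial_sums` is introduced here only for illustration and does not come from the text.

```python
from itertools import accumulate, cycle, islice


def grandi_partial_sums(count):
    """Return the first `count` partial sums of 1 - 1 + 1 - 1 + ..."""
    terms = islice(cycle([1, -1]), count)
    return list(accumulate(terms))


sums = grandi_partial_sums(10)
print(sums)       # the partial sums alternate between 1 and 0
print(set(sums))  # only two distinct values ever occur
print(max(sums))  # so 1 is an upper bound for the set of partial sums
```

The partial sums never settle near a single value, which is consistent with the divergence shown in Example 13.1.7, even though the set of partial sums is bounded.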


13.5 The Comparison Test


In the case where all of the terms of an infinite series are nonnegative, the converse 
of Theorem 13.4.5 is true. 


Theorem 13.5.1. If $a_1 + a_2 + a_3 + \cdots$ is an infinite series with $a_i \geq 0$ for all $i$, then
the series converges if and only if the set of all partial sums has an upper bound. 


Proof. Convergence implies that the set of partial sums has an upper bound, 
by Theorem 13.4.5. The proof of the converse requires the Least Upper Bound 
Property (13.4.3). To begin the proof, assume that the set of partial sums has an 
upper bound. Then, by the Least Upper Bound Property, the set of partial sums has 
a least upper bound. Call this least upper bound S; we prove that the series converges 
to S. For this, we must show (Definition 13.1.4) that, for every $\varepsilon > 0$, there is an N
such that $|S_n - S| < \varepsilon$ whenever $n \geq N$.

Let $\varepsilon > 0$ be given. Since $\varepsilon$ is greater than 0, $S - \varepsilon < S$. Since S is the least
upper bound of the set of partial sums and $S - \varepsilon$ is less than S, $S - \varepsilon$ cannot be an
upper bound of the set of partial sums. Thus, there is some N such that $S - \varepsilon < S_N$.
We show that this N satisfies the definition of convergence (13.1.4) for the given $\varepsilon$.
The hypothesis that each term of the series is nonnegative implies that if $n_1 \leq n_2$,
then $S_{n_1} \leq S_{n_2}$, since $S_{n_2}$ is obtained by adding nonnegative numbers to $S_{n_1}$. In
particular, $S_N \leq S_n$ whenever $n \geq N$. Also, since S is an upper bound for the set of
partial sums, $S_n \leq S$ for every n. Therefore, when $n \geq N$,


$$S - \varepsilon < S_N \leq S_n \leq S$$


In particular, $S - \varepsilon < S_n < S + \varepsilon$ for every such n. Subtracting S from each term in
this inequality yields $-\varepsilon < S_n - S < \varepsilon$, which is equivalent to $|S_n - S| < \varepsilon$. Thus,
$S_n$ is within $\varepsilon$ of S for every $n \geq N$, and the theorem follows. □


Example 13.5.2. The series


$$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 2^2} + \frac{1}{3 \cdot 2^3} + \frac{1}{4 \cdot 2^4} + \cdots + \frac{1}{n \cdot 2^n} + \cdots$$


converges.


Proof. First, $\frac{1}{n \cdot 2^n}$ is less than or equal to $\frac{1}{2^n}$ for every natural number n. Thus, every
partial sum of this series is less than or equal to the corresponding partial sum of
the geometric series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$. Since the sum of the geometric series with
ratio $\frac{1}{2}$ and first term $\frac{1}{2}$ is 1 (Example 13.2.5), it follows that the set of partial sums
of both series have 1 as an upper bound. Therefore, by Theorem 13.5.1, the series
converges. □
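
The comparison used in this proof can be checked numerically. The sketch below assumes the series has general term $\frac{1}{n \cdot 2^n}$ (the term compared with $\frac{1}{2^n}$ in the proof) and uses exact rational arithmetic; the function name `partial_sums` is our own and purely illustrative.

```python
from fractions import Fraction


def partial_sums(term, count):
    """Exact partial sums S_1, ..., S_count of term(1) + term(2) + ..."""
    total = Fraction(0)
    sums = []
    for n in range(1, count + 1):
        total += term(n)
        sums.append(total)
    return sums


given = partial_sums(lambda n: Fraction(1, n * 2**n), 20)
geometric = partial_sums(lambda n: Fraction(1, 2**n), 20)

# Each partial sum of the given series is at most the corresponding
# geometric partial sum, and every geometric partial sum is below 1.
assert all(a <= g < 1 for a, g in zip(given, geometric))
print(float(given[-1]), float(geometric[-1]))
```

A finite check like this is not a proof, but it shows the two facts the argument relies on: the term-by-term inequality and the upper bound 1.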


The previous example is a special case of a general situation: A series with
nonnegative terms must converge if it is “term by term” less than or equal to a
convergent series. That is, the following theorem holds.


The Comparison Test 13.5.3. Suppose $0 \leq a_n \leq b_n$ for every natural number n.


(i) If the series $b_1 + b_2 + b_3 + \cdots$ converges, then the series $a_1 + a_2 + a_3 + \cdots$
converges.

(ii) If the series $a_1 + a_2 + a_3 + \cdots$ diverges, then the series $b_1 + b_2 + b_3 + \cdots$
diverges.


Proof. (i) Let S be the sum of the series $b_1 + b_2 + b_3 + \cdots$. It is clear that every
partial sum of the series $a_1 + a_2 + a_3 + \cdots$ is at most S. Thus, by Theorem 13.5.1,
the series $a_1 + a_2 + a_3 + \cdots$ converges.

(ii) If $b_1 + b_2 + b_3 + \cdots$ did converge, then $a_1 + a_2 + a_3 + \cdots$ would converge,
by part (i). Therefore, $b_1 + b_2 + b_3 + \cdots$ diverges. □


The following is a very easy application of the Comparison Test. 


Example 13.5.4. The series oy + wp + 753 + po feeet ats +--+ converges. 


Proof. This series is clearly term by term less than the convergent geometric series 
I + ti + al. ae oO 
i ai ; 

A slightly more complicated application is the following. 


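Example 13.5.5. The series


$$\frac{1}{3} + \frac{2}{3^2} + \frac{3}{3^3} + \frac{4}{3^4} + \cdots + \frac{n}{3^n} + \cdots$$


converges.


Proof. We will establish that $\frac{n}{3^n}$ is less than $\frac{1}{2^n}$ for every natural number n; the
convergence of the given series then follows from part (i) of the Comparison Test
(13.5.3) by comparing the given series to the geometric series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots$.
The fact that $\frac{n}{3^n}$ is less than $\frac{1}{2^n}$ for every natural number n can be established by
the Generalized Principle of Mathematical Induction (2.1.4). To see this, note that
$\frac{n}{3^n} < \frac{1}{2^n}$ is equivalent to $\left(\frac{2}{3}\right)^n < \frac{1}{n}$. This is true for $n = 1$ and also for $n = 2$. Now
suppose that $\left(\frac{2}{3}\right)^k < \frac{1}{k}$ for some $k \geq 2$. Note that $\frac{2}{3} \leq \frac{k}{k+1}$ when $k \geq 2$. It follows
that


$$\left(\frac{2}{3}\right)^{k+1} = \left(\frac{2}{3}\right)^k \cdot \frac{2}{3} < \frac{1}{k} \cdot \frac{k}{k+1} = \frac{1}{k+1}$$


Therefore, $\frac{n}{3^n} < \frac{1}{2^n}$ for all $n \geq 2$, and the case $n = 1$ was checked directly. Thus,
part (i) of the Comparison Test (13.5.3) gives the result. □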
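
The inequality at the heart of this proof, and its equivalent form, can be spot-checked with exact arithmetic. This is only a finite check under the assumption that the series has general term $\frac{n}{3^n}$; the function names are hypothetical and chosen for this sketch.

```python
from fractions import Fraction


def ratio_bound_holds(n):
    """Check n / 3**n < 1 / 2**n exactly."""
    return Fraction(n, 3**n) < Fraction(1, 2**n)


def equivalent_form_holds(n):
    """Check the equivalent inequality (2/3)**n < 1/n exactly."""
    return Fraction(2, 3) ** n < Fraction(1, n)


checked = range(1, 201)
print(all(ratio_bound_holds(n) for n in checked))
print(all(ratio_bound_holds(n) == equivalent_form_holds(n) for n in checked))

# The partial sums therefore stay below those of 1/2 + 1/4 + 1/8 + ...,
# which are all less than 1.
partial = sum(Fraction(n, 3**n) for n in range(1, 41))
print(float(partial))
```

The induction in the proof is what guarantees the inequality for every n; the code merely confirms it for the first 200 values.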


13.6 Representing Real Numbers by Infinite Decimals


We show that every infinite decimal represents a real number, and, conversely, that 
every real number has a representation as an infinite decimal. 
We define nonnegative infinite decimals as follows. 


Definition 13.6.1. A nonnegative infinite decimal is an expression of the form
$M.a_1a_2a_3\ldots$, where M is a nonnegative integer and each $a_i$ is a “digit” (i.e., a
number in the set $\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$). We regard such an expression as representing the
infinite series $M + \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \cdots + \frac{a_k}{10^k} + \cdots$.


Theorem 13.6.2. Every nonnegative infinite decimal converges. 


Proof. We must show that the infinite series $M + \frac{a_1}{10} + \frac{a_2}{10^2} + \cdots + \frac{a_k}{10^k} + \cdots$ converges
whenever M is a nonnegative integer and each $a_k$ is a digit. Since each $a_k$ is a digit,
$\frac{a_k}{10^k}$ is less than or equal to $\frac{9}{10^k}$, for every k. It follows that the infinite decimal is
term by term less than or equal to the “comparison series”


$$M + \frac{9}{10} + \frac{9}{10^2} + \frac{9}{10^3} + \cdots$$


This comparison series converges to $M + \frac{9/10}{1 - 1/10} = M + 1$, since its partial sums are
all of the form M plus a partial sum of the convergent geometric series $\frac{9}{10} + \frac{9}{10^2} + \frac{9}{10^3} + \cdots$.
Therefore, part (i) of the Comparison Test (13.5.3) gives the result. □


Negative infinite decimals can be defined as negations of nonnegative ones. For
example, $-7.11\ldots = -(7.11\ldots)$. In general, when M is a nonnegative integer and
each $a_k$ is a digit, we define $-M.a_1a_2a_3\ldots$ to be $(-1) \cdot \left(M + \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \cdots\right)$.


Theorem 13.6.3. Every real number can be represented as an infinite decimal.


Proof. First, the infinite decimal .0000... represents the real number 0. We next
consider the case where r is a positive real number. We can obtain a decimal
representation for r as follows. Let M be the greatest integer that is less than or
equal to r. Such an M will be 0 or a natural number. Note that $r - M$ is less than
1, since otherwise $M + 1$ would be an integer less than or equal to r. Let $a_1$ be the
largest digit such that $\frac{a_1}{10}$ is less than or equal to $r - M$. Then $r - \left(M + \frac{a_1}{10}\right)$ is less
than $\frac{1}{10}$. Let $a_2$ be the largest digit such that $\frac{a_2}{10^2}$ is less than or equal to $r - \left(M + \frac{a_1}{10}\right)$.
Then $r - \left(M + \frac{a_1}{10} + \frac{a_2}{10^2}\right)$ is less than $\frac{1}{10^2}$; then let $a_3$ be the largest digit such that
$\frac{a_3}{10^3}$ is less than or equal to $r - \left(M + \frac{a_1}{10} + \frac{a_2}{10^2}\right)$.

Continue constructing digits in this manner. Then, for each k, the absolute
value of $r - \left(M + \frac{a_1}{10} + \frac{a_2}{10^2} + \cdots + \frac{a_k}{10^k}\right)$ is less than $\frac{1}{10^k}$. We claim that the infinite
series $M + \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \cdots$ converges to r. To see this, let $\varepsilon > 0$ be given and
choose N large enough that $10^{N-1}$ is greater than $\frac{1}{\varepsilon}$. If $S_n$ is the $n$th partial sum of
the series $M + \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \cdots$ and $n \geq N$, then


$$|r - S_n| = \left| r - \left(M + \frac{a_1}{10} + \frac{a_2}{10^2} + \cdots + \frac{a_{n-1}}{10^{n-1}}\right) \right| < \frac{1}{10^{n-1}} \leq \frac{1}{10^{N-1}} < \varepsilon$$


Thus, the series converges to r.
If r is a negative real number, an infinite decimal representation of r can be
obtained by multiplying an infinite decimal representation of $-r$ by $-1$. □
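
The digit-by-digit construction in this proof is effectively an algorithm. The following Python sketch carries it out for a nonnegative rational input, using exact fractions so that the "largest digit" choice is never affected by rounding. The function name `decimal_digits` and the restriction to rational inputs are assumptions made for this illustration; the proof itself applies to every real number.

```python
from fractions import Fraction
from math import floor


def decimal_digits(r, count):
    """Return (M, [a_1, ..., a_count]) for a nonnegative rational r.

    M is the greatest integer <= r, and each a_k is the largest digit with
    a_k / 10**k <= r - (M + a_1/10 + ... + a_(k-1)/10**(k-1)).
    """
    r = Fraction(r)
    M = floor(r)
    remainder = r - M  # lies in [0, 1)
    digits = []
    for k in range(1, count + 1):
        # remainder < 1/10**(k-1), so this floor is a digit between 0 and 9
        digit = floor(remainder * 10**k)
        digits.append(digit)
        remainder -= Fraction(digit, 10**k)
    return M, digits


print(decimal_digits(Fraction(1, 3), 5))
print(decimal_digits(Fraction(22, 7), 6))
print(decimal_digits(Fraction(3, 10), 4))
```

Note that this greedy choice produces the representation of 3/10 that ends in 0's, never the one ending in 9's.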


Thus, real numbers can be represented by infinite decimals. However, the 
representation of a real number by an infinite decimal is not necessarily unique. 
Some real numbers have two distinct representations. For example, summing the 
corresponding infinite series shows that .299999... equals .300000.... More 
generally, every real number represented by an infinite decimal that ends with an 
infinite string of 9’s also has a representation that ends with a string of 0’s. That is the 
only way that a real number can have two distinct infinite decimal representations. 
Before proving this fact, we make the following observation. 


Lemma 13.6.4. The infinite decimal $.c_1c_2c_3\ldots$ is at most 1. Moreover, if any $c_k$ is
less than 9, then $.c_1c_2c_3\ldots$ is less than 1. In particular, $.c_1c_2c_3\ldots$ is equal to 1 if
and only if $c_k = 9$ for every k.


Proof. The infinite series that $.c_1c_2c_3\ldots$ represents is term by term less than or
equal to the geometric series $\frac{9}{10} + \frac{9}{10^2} + \frac{9}{10^3} + \cdots$. Since that geometric series sums
to 1, every partial sum of $\frac{c_1}{10} + \frac{c_2}{10^2} + \frac{c_3}{10^3} + \cdots$ is at most 1. Therefore, the infinite
decimal $.c_1c_2c_3\ldots$ is at most 1. If $c_k$ is strictly less than 9, then every partial sum
of $\frac{c_1}{10} + \frac{c_2}{10^2} + \frac{c_3}{10^3} + \cdots$ is less than $1 - \frac{1}{10^k}$, so $.c_1c_2c_3\ldots$ is at most $1 - \frac{1}{10^k}$. □


Theorem 13.6.5. If two different infinite decimals represent the same real number, 
then one of them ends in a string of 9’s and the other ends in a string of 0's. 


Proof. Clearly, it suffices to prove the theorem for representations of positive real 
numbers. Suppose, then, that a positive number has distinct representations 


$$M.a_1a_2a_3\ldots = N.b_1b_2b_3\ldots$$

(Saying that the representations are distinct means that they differ in at least one
digit.) Consider the case where M and N are different. Since $M \neq N$, one of them
is larger. Suppose that M is greater than N; that is, $M - N \geq 1$. On the other hand,
$M - N \leq 1$ since


$$M - N \leq (M - N) + .a_1a_2a_3\ldots = M.a_1a_2a_3\ldots - N = .b_1b_2b_3\ldots$$


which we know from Lemma 13.6.4 is at most 1. Thus, $M - N = 1$, and the above
gives


$$1 \leq 1 + .a_1a_2a_3\ldots = .b_1b_2b_3\ldots \leq 1$$


This implies that $.b_1b_2b_3\ldots = 1$ and $.a_1a_2a_3\ldots = 0$. Therefore, every $a_k$ is 0 and,
by Lemma 13.6.4, every $b_k$ is 9.

Now suppose that $M = N$ and that the $n$th decimal place is the first decimal place
where the two representations differ; that is, $a_j = b_j$ for all j less than n and $a_n$ is
different from $b_n$. Multiplying by $10^n$ yields distinct representations


$$a_n.a_{n+1}a_{n+2}a_{n+3}\ldots = b_n.b_{n+1}b_{n+2}b_{n+3}\ldots$$


Since $a_n \neq b_n$, one of them is larger; suppose that $a_n$ is greater than $b_n$. Then, the
first case implies that $a_k = 0$ and $b_k = 9$ for every $k \geq n + 1$. This proves the
theorem. □
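
The example mentioned before Lemma 13.6.4, that .299999... equals .300000..., can be made concrete by computing how far the truncations of .2999... are from 3/10. The sketch below uses exact fractions; the helper name `decimal_partial_sum` is hypothetical.

```python
from fractions import Fraction


def decimal_partial_sum(M, digits):
    """Exact value of M + a_1/10 + a_2/10**2 + ... for a finite digit list."""
    return M + sum(Fraction(a, 10**k) for k, a in enumerate(digits, start=1))


for n in (1, 5, 10):
    nines = decimal_partial_sum(0, [2] + [9] * n)  # .2 followed by n nines
    gap = Fraction(3, 10) - nines
    # The gap to 3/10 is exactly 1/10**(n+1), which shrinks below any epsilon.
    print(n, gap == Fraction(1, 10 ** (n + 1)))
```

Since the gap after n nines is exactly $\frac{1}{10^{n+1}}$, the partial sums of .2999... converge to 3/10, the same number represented by .3000....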


One possible way of constructing the real numbers from the integers is to 
use infinite decimals. However, it is not easy to describe the basic arithmetic 
operations of addition and multiplication in terms of infinite decimals. Another way 
of constructing the real numbers is outlined in Problem 15 in Chapter 8. 


13.7 Further Examples of Infinite Series 


Definition 13.7.1. The harmonic series is the series


$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n} + \cdots$$


Theorem 13.7.2. The harmonic series diverges. 


Proof. This will follow from Theorem 13.4.5 if we show that the set of partial sums
does not have an upper bound. We do this by establishing that, for every natural
number M, there is a partial sum that is greater than $\frac{1}{2} \cdot M$.

Begin by observing that $\frac{1}{3} + \frac{1}{4}$ is greater than $\frac{1}{4} + \frac{1}{4} = \frac{1}{2}$. Similarly, $\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}$
is greater than $\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{1}{2}$. Moreover,


$$\frac{1}{9} + \frac{1}{10} + \cdots + \frac{1}{16}$$

is greater than

$$\frac{1}{16} + \frac{1}{16} + \cdots + \frac{1}{16} = \frac{8}{16} = \frac{1}{2}$$


In general, for every natural number k, each of the terms of the harmonic series from
$\frac{1}{2^{k-1}+1}$ to $\frac{1}{2^k}$ is at least $\frac{1}{2^k}$ and there are $2^k - 2^{k-1} = 2^{k-1} \cdot (2 - 1) = 2^{k-1}$ such
terms. Therefore, the contribution of the sum of those terms to the harmonic series
is at least $2^{k-1} \cdot \frac{1}{2^k} = \frac{1}{2}$.

The above shows that, for every natural number k, the partial sum $S_{2^k}$ is at least
$\frac{1}{2} \cdot k$. Thus, for every natural number M, there are partial sums of the harmonic series
which are greater than $\frac{1}{2} \cdot M$, so the series diverges. □
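
The grouping argument can be checked directly for small k. The following sketch, with illustrative function names of our own choosing, verifies that each block of terms from $\frac{1}{2^{k-1}+1}$ to $\frac{1}{2^k}$ sums to at least $\frac{1}{2}$ and that the partial sum with $2^k$ terms is at least $\frac{k}{2}$.

```python
from fractions import Fraction


def harmonic_partial_sum(n):
    """Exact n-th partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, j) for j in range(1, n + 1))


def block_sum(k):
    """Sum of the terms from 1/(2**(k-1) + 1) through 1/2**k."""
    return sum(Fraction(1, j) for j in range(2 ** (k - 1) + 1, 2**k + 1))


for k in range(1, 11):
    assert block_sum(k) >= Fraction(1, 2)
    assert harmonic_partial_sum(2**k) >= Fraction(k, 2)
    print(k, float(harmonic_partial_sum(2**k)))
```

The printed values grow very slowly, which is why the divergence of the harmonic series is not obvious from numerical experiments alone.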


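Example 13.7.3. The series


$$\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n}} + \cdots$$


diverges.


Proof. For every natural number n, $\sqrt{n} \leq n$, so $\frac{1}{\sqrt{n}} \geq \frac{1}{n}$. Therefore, the given series
diverges by part (ii) of the Comparison Test (13.5.3), comparing it to the harmonic
series (Theorem 13.7.2). □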

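Example 13.7.4. The series


$$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots$$


converges to 1.


Proof. By Problem 2 in Chapter 2, the $n$th partial sum of this series is $\frac{n}{n+1}$. It is
apparent that this is close to 1 if n is large. To formally establish this, let $\varepsilon$ be any
positive number. If N is a natural number such that $N + 1 > \frac{1}{\varepsilon}$, then $\varepsilon > \frac{1}{N+1}$.
Therefore, if $n \geq N$, then

$$\left| 1 - \frac{n}{n+1} \right| = \frac{1}{n+1} \leq \frac{1}{N+1} < \varepsilon. \qquad \Box$$
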
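
The choice of N in this proof can be carried out explicitly. The sketch below, whose function names are assumptions made for illustration, computes exact partial sums of $\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots$ and checks that they are within a given $\varepsilon$ of 1 from the index N onward.

```python
from fractions import Fraction
from math import floor


def telescoping_partial_sum(n):
    """Exact n-th partial sum of 1/(1*2) + 1/(2*3) + ... + 1/(n*(n+1))."""
    return sum(Fraction(1, j * (j + 1)) for j in range(1, n + 1))


def index_for(eps):
    """A natural number N with N + 1 > 1/eps, as chosen in the proof."""
    return max(1, floor(1 / Fraction(eps)))


eps = Fraction(1, 1000)
N = index_for(eps)
for n in range(N, N + 5):
    s = telescoping_partial_sum(n)
    assert s == Fraction(n, n + 1)  # the closed form cited from Problem 2, Chapter 2
    assert 1 - s < eps
print(N)
```

Only a few indices beyond N are tested here; the inequality $\frac{1}{n+1} \leq \frac{1}{N+1}$ is what covers all of them.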


Example 13.7.5. The infinite series


$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots + \frac{1}{n^2} + \cdots$$


converges.


Proof. Since the partial sums of

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots$$

are obtained by adding 1 to those of

$$\frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2} + \cdots$$

it suffices to show that this latter series converges. We will establish this by
comparison to the convergent series from Example 13.7.4. The series


$$\frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{(n+1)^2} + \cdots$$

is term by term less than the series


$$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots$$

since $\frac{1}{(n+1)^2}$ is less than $\frac{1}{n(n+1)}$ for every natural number n. Therefore, the series

$$\frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots$$

converges by the Comparison Test (13.5.3). □


It can be shown, using calculus, that the series in the above example converges
to $\frac{\pi^2}{6}$.
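
The comparison in Example 13.7.5 can be observed numerically. In this floating-point sketch (function names are ours, for illustration only), the partial sums of $1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots$ are compared with 1 plus the partial sums of the series from Example 13.7.4, which equal $2 - \frac{1}{n}$.

```python
import math


def basel_partial_sum(n):
    """n-th partial sum 1 + 1/2**2 + ... + 1/n**2 (floating point)."""
    return sum(1.0 / j**2 for j in range(1, n + 1))


def comparison_partial_sum(n):
    """1 plus the sum of 1/(j*(j+1)) for j = 1, ..., n-1; this equals 2 - 1/n."""
    return 1.0 + sum(1.0 / (j * (j + 1)) for j in range(1, n))


for n in (10, 100, 1000):
    print(n, basel_partial_sum(n), comparison_partial_sum(n))

# The partial sums stay below 2 and approach pi**2 / 6 from below.
print(math.pi**2 / 6)
```

The comparison bound 2 is far from sharp; it establishes convergence but says nothing about the value of the sum, which requires the calculus argument mentioned above.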


Theorem 13.7.6. For every $p \geq 2$, the series


$$1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots + \frac{1}{n^p} + \cdots$$


converges.


Proof. This follows immediately from the previous example and the fact that $p \geq 2$
implies that $\frac{1}{n^p} \leq \frac{1}{n^2}$ for every natural number n. □


It can be proven (using integral calculus) that, in fact, $1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots$ converges
for every $p > 1$.

Infinite series are often used to define specific numbers. For example, the famous 
number e, the base of the natural logarithm, can be defined as an infinite series. (It 
can also be defined in many other ways.) 


Example 13.7.7. The series $1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} + \cdots$ converges. The sum
of the series is denoted by e.


Proof. An easy application of the Generalized Principle of Mathematical Induction
shows that $n! > 2^n$ for all $n \geq 4$ (see Theorem 2.1.5 of Chapter 2). Thus, $\frac{1}{n!} < \frac{1}{2^n}$
for all $n \geq 4$. Hence, $\frac{1}{4!} + \frac{1}{5!} + \frac{1}{6!} + \cdots$ converges by the Comparison Test (13.5.3),
and therefore so does the entire series. □
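
The following sketch spot-checks the inequality $n! > 2^n$ used in the proof and shows how quickly the partial sums approach the value Python stores as `math.e`. The function name `e_partial_sum` is hypothetical; the comparison with `math.e` is a floating-point check, not part of the proof.

```python
import math
from fractions import Fraction


def e_partial_sum(n):
    """Exact sum 1 + 1/1! + 1/2! + ... + 1/n!."""
    return sum(Fraction(1, math.factorial(j)) for j in range(n + 1))


# Spot-check the inequality n! > 2**n for 4 <= n < 50.
assert all(math.factorial(n) > 2**n for n in range(4, 50))

for n in (5, 10, 15):
    approx = float(e_partial_sum(n))
    print(n, approx, math.e - approx)
```

Because the terms $\frac{1}{n!}$ shrink much faster than $\frac{1}{2^n}$, a handful of terms already agrees with e to many decimal places.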


We next consider a particularly interesting example of a divergent series. 


Theorem 13.7.8. Let $p_j$ denote the $j$th prime number (so that $p_1 = 2$, $p_2 = 3$,
$p_3 = 5$, etc.). Then the series


$$\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{p_j} + \cdots$$


diverges. That is, the series of the reciprocals of the primes diverges.


Proof. This can be proven in several ways. The proof that we present below appears
to be the easiest, although it is somewhat tricky. We show that the assumption that
the series converges leads to a contradiction.

Suppose that the sum of the series was S. Then, by the definition of convergence
(Definition 13.1.4), there would be a partial sum of the series, say $S_k$, such that $S_k$ is
within $\frac{1}{2}$ of S. It would then follow that the sum of the series obtained by discarding
the first k terms would be less than $\frac{1}{2}$. That is, there would be a natural number k
such that


$$\frac{1}{p_{k+1}} + \frac{1}{p_{k+2}} + \frac{1}{p_{k+3}} + \cdots$$


is less than $\frac{1}{2}$. We proceed to show that this is impossible.

For the rest of the proof, we fix such a k and say that $p_j$ is a “small prime” if j
is less than or equal to k and that $p_j$ is a “big prime” if j is greater than k. For each
natural number x, let N(x) denote the number of natural numbers that are less than
or equal to x and are not divisible by any big prime. The surprising trick involves
obtaining, and using, an upper bound on N(x).

Fix any natural number x. We can get a crude upper bound for N(x) as follows.
Every natural number y that is counted in N(x) can be written as a product uv,
where u is a perfect square and v does not have any perfect square divisors. To
see this, use the prime factorization (Corollary 4.1.2) of y to factor out the biggest
perfect square u; v is the other factor of y. Note that u = 1 if 1 is the largest perfect
square that divides y, and v = 1 if y itself is a perfect square. The number of distinct
u’s that arise from y’s that are counted in N(x) must be less than or equal to $\sqrt{x}$,
since each u is less than or equal to x and is therefore the square of a number that
is less than or equal to $\sqrt{x}$. Also, each of the v’s consists of a product of small
primes raised to at most the first power. There are k small primes, so the number of
possible v’s is at most $2^k$ (since each small prime may or may not occur in the prime
factorization of each v). Every y that is counted by N(x) is of the form uv and there
are at most $\sqrt{x}$ u’s and $2^k$ v’s, from which it follows that $N(x) \leq 2^k \sqrt{x}$, for each
x. We will use this inequality to derive a contradiction.

Next, we establish a lower bound for N(x) that will be inconsistent with the
above inequality when x is large. First, for any prime p and any natural number x,
there are at most $\frac{x}{p}$ multiples of p that are less than or equal to x. Thus there are
at most $\frac{x}{p_j}$ natural numbers less than or equal to x that have the big prime $p_j$ as
a factor. Recall that N(x) denotes the number of natural numbers less than or
equal to x that have only small prime factors. Therefore, $x - N(x)$ is the number
of natural numbers less than or equal to x having at least one big prime as a factor.
There are at most $\frac{x}{p_{k+1}}$ of those that have the big prime $p_{k+1}$ as a factor;
at most $\frac{x}{p_{k+2}}$ of those that have the big prime $p_{k+2}$ as a factor; and so on. Thus,
$x - N(x)$ is less than or equal to $\frac{x}{p_{k+1}} + \frac{x}{p_{k+2}} + \frac{x}{p_{k+3}} + \cdots$. Each partial sum of this
series is x times a partial sum of the series $\frac{1}{p_{k+1}} + \frac{1}{p_{k+2}} + \frac{1}{p_{k+3}} + \cdots$. Since the sum
of the latter series is less than $\frac{1}{2}$, it follows that $x - N(x)$ is less than $\frac{x}{2}$.

The inequality $x - N(x) < \frac{x}{2}$ is equivalent to $\frac{x}{2} < N(x)$. Combining this with
the previous inequality we obtained for N(x) gives


$$\frac{x}{2} < N(x) \leq 2^k \sqrt{x}$$


Therefore,


$$\frac{x}{2} < 2^k \sqrt{x}$$


Multiplying both sides of this inequality by 2 and dividing by $\sqrt{x}$ yields


$$\sqrt{x} < 2^{k+1}$$


Thus, assuming that the series converges leads to the conclusion that there is a
natural number k such that $\sqrt{x} < 2^{k+1}$ for every natural number x. Now we have
our contradiction: Since k is fixed, the above cannot hold for all natural numbers x.
For example, if $x = 2^{2k+4}$, then $\sqrt{x} = 2^{k+2}$, which is larger than $2^{k+1}$. □
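
The upper bound $N(x) \leq 2^k \sqrt{x}$ from this proof holds for any choice of k, so it can be tested by brute force. The sketch below is an illustration under our own naming (`primes_up_to`, `square_times_squarefree`, `count_small_smooth` are hypothetical helpers): it performs the factorization y = uv and counts the numbers with only "small" prime factors.

```python
from math import isqrt


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]


def square_times_squarefree(y):
    """Write y = u * v with u the largest perfect square dividing y."""
    u = max(d * d for d in range(1, isqrt(y) + 1) if y % (d * d) == 0)
    return u, y // u


def count_small_smooth(x, k):
    """N(x): how many y <= x have no prime factor beyond the first k primes."""
    small = primes_up_to(x)[:k]
    count = 0
    for y in range(1, x + 1):
        for p in small:
            while y % p == 0:
                y //= p
        count += (y == 1)  # y reduces to 1 exactly when all its prime factors are small
    return count


k = 3  # small primes 2, 3, 5
for x in (10, 100, 1000):
    n_x = count_small_smooth(x, k)
    # N(x) <= 2**k * sqrt(x), compared via squares to avoid floating point
    print(x, n_x, n_x**2 <= 4**k * x)
print(square_times_squarefree(72))
```

The bound is crude, as the proof says, but it grows only like $\sqrt{x}$, which is what eventually conflicts with the lower bound $\frac{x}{2} < N(x)$.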


The preceding provides another proof that there are infinitely many primes; if 
there were only a finite number of primes, the series consisting of the reciprocals 
of the primes would have only a finite number of terms, and therefore would 
converge. 

In Chapter 1, we mentioned the famous unsolved twin primes problem. While 
(in spite of dramatic progress by Yitang Zhang and others, beginning in 2013) the 
twin primes problem is still unsolved, Viggo Brun proved in 1919 that the sum of 
the reciprocals of the twin primes converges. (The proof of this is well beyond the 
scope of this book.) 

There is much more that is known about infinite series. Some other results are 
outlined in the interesting and challenging problems below. Almost every calculus 
book contains a large amount of related material. 


13.8 Problems 


Basic Exercises 


1. Find the sum of each of the following geometric series: 
1,1 1 
(a) Lg hag = ag reas 
(b) 10+ 7+ 54+B4+.-- 


Bi hig a i a Ba 8 
© 2+ Cay * Gy * Oy 

2. For each of the following sets, determine if the set has an upper bound, and, if
so, find the least upper bound:


oe ear a me oa 
11 
(b) {1, —- a3 Jes =f 
(c) {-l, - La Lae] J 
(d) {—7, 12, —4, 6} 
3. Determine which of the following series converges: 
(a) 1-F+4 nt 
(b) V2+789+ 2042+ 34+ 54+ 454+--+5%4+-° 
(c) 1+.14+.14.14- 


@ issebebedegetade. 
(ec) 1+ 4 +3 at rtatatetat 


OQ at+z 3 rt+t 


4. For each of the following infinite decimals, determine the rational number that it
represents:


(a) .7777777...
(b) .3434343434... 
(c) 17.389389389389... 


5. Let t be any positive real number. Use the Principle of Mathematical Induction
to prove that $(1 + t)^n \geq 1 + nt$ for every natural number n. (This result was used
without proof in Theorem 13.2.3.)

6. Determine which of the following series converge: 


() +3 4+5+5t+24+--4+5H+-° 
6) s+gt+ytctmgt 


Interesting Problems 


7. Determine which of the following series converge: 


(a) 19+ ant taint “+ +... 
O) gtgtotet ete th tata 


8. Determine the rational number that is represented by each of the following
infinite decimals:


(a) 6.798345345345345345... 
(b) —38.0006561234123412341234... 
(c) .012345678901234567890123456789...

9. Suppose that the sum of the series $a_1 + a_2 + a_3 + \cdots + a_n + \cdots$ is S and the
sum of the series $b_1 + b_2 + b_3 + \cdots + b_n + \cdots$ is T. Prove that the sum of the
series


$$(a_1 - b_1) + (a_2 - b_2) + (a_3 - b_3) + \cdots + (a_n - b_n) + \cdots$$


is $S - T$.

10. (“Sigma notation”) There is a standard notation that is often used when
considering infinite series and in many other contexts. The Greek letter $\Sigma$
(called “sigma”) is part of a shorthand for representing sums, as illustrated by
the following examples:


(i) $\sum_{i=1}^{4} a_i$ is defined to be $a_1 + a_2 + a_3 + a_4$; it can be read “the sum from $i = 1$
to 4 of $a_i$”

(ii) x3 means 5 34+ a 3; it can be read “the sum from n = 3 to 5 of 5 he 


(iii) $\sum_{j=5}^{21} j^2$ means $5^2 + 6^2 + 7^2 + \cdots + 21^2$; it can be read “the sum from $j = 5$
to 21 of $j^2$”
[o,@) 
(iv) x means 5 s+ a + 35 ae -+ x +--+; it can be read “the sum from 


t=1 
i = | to infinity of x” 


Thus, $\Sigma$ is used as above to indicate sums. When we write $\sum_{i=1}^{\infty} a_i$ we do
not necessarily imply that the series converges; it is merely shorthand for the
infinite series $a_1 + a_2 + a_3 + \cdots$. If the series converges to S, we may write


z a; = S. For example, x =1. 
i=l i 

(a) Find: ; 

i=l 

(b) Find: 3° (—1)'- G+ 4 
(c) Find: 3 t 
(d) Find: }°(-1)'- 2 
(e) Find: 3° (—1)! 


‘ 17 1 1 
(f) Find: “(t = 1) 
(g) Show that $m \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} m a_i$, where n is any natural number and m is any
real number.
(h) Show that $\left(\sum_{i=1}^{n} a_i\right) + \left(\sum_{i=1}^{n} b_i\right) = \sum_{i=1}^{n} (a_i + b_i)$.

11. Show that there is no least upper bound property for the set of rational numbers. 

In other words, show that there are nonempty sets of rational numbers that have 

rational upper bounds but which do not have a least rational upper bound. 


Challenging Problems 


12. Show that the series $1 + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \cdots$ diverges.

13. (a) Suppose that the series $a_1 + a_2 + a_3 + \cdots$ converges. Prove that, for each
positive number $\delta$, there is a natural number N such that $|a_i| < \delta$ for every
$i \geq N$.

(b) Suppose that there is a positive number $\delta$ such that $|a_i| \geq \delta$ for infinitely
many $a_i$. Show that the series $a_1 + a_2 + a_3 + \cdots$ diverges.
(c) Let a; = (—1)! nit Show that the series a] + a2 +.a3+--- diverges. 

14. (A form of the “Ratio Test”)

(a) Show that the series $a_1 + a_2 + a_3 + \cdots$ with nonnegative terms converges
if there is a positive number $r < 1$ such that $a_{i+1}$ is less than or equal to $r a_i$
for every i.
[Hint: Compare the given series to the geometric series $a_1 + a_1 r + a_1 r^2 +
a_1 r^3 + \cdots$.]

(b) Show that the series $a_1 + a_2 + a_3 + \cdots$ with nonnegative terms converges
if there exist an N and a positive number $r < 1$ such that $a_{i+1}$ is less than
or equal to $r a_i$ for every $i \geq N$.


15. (Absolute convergence implies convergence) The series $a_1 + a_2 + a_3 + \cdots$
is said to converge absolutely if the series $|a_1| + |a_2| + |a_3| + \cdots$ converges.
The following is an outline of a proof that a series converges if it converges
absolutely.


Suppose that $|a_1| + |a_2| + |a_3| + \cdots$ converges.


(a) Show that $2|a_1| + 2|a_2| + 2|a_3| + \cdots$ converges.

(b) Use the Comparison Test (13.5.3) to show that $(|a_1| + a_1) + (|a_2| + a_2) +
(|a_3| + a_3) + \cdots$ converges.

(c) Prove that the series $-|a_1| - |a_2| - |a_3| - \cdots$ converges, and add it to the
series in part (b) to show that $a_1 + a_2 + a_3 + \cdots$ converges.


16. Prove that, for every real number x, the series $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$
converges. (This is one definition of the exponential function $e^x$, where e is the
base of the natural logarithm. That is, $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$.)
[Hint: Since absolute convergence implies convergence (Problem 15), it suffices
to prove convergence for positive x. For this, use the Ratio Test (Problem 14)
with N any natural number larger than x and with $r = \frac{x}{N}$.]

17. (A form of the “Root Test”) Suppose that $a_1 + a_2 + a_3 + \cdots$ is a series with
nonnegative terms. Prove that the series converges if there is a positive number
$r < 1$ and a natural number N such that $(a_i)^{1/i} \leq r$ for all $i \geq N$.

18. The alternating harmonic series is the series:


$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{(-1)^{n+1}}{n} + \cdots$$


(a) Let T denote the set of even partial sums of the alternating harmonic series;
that is, $T = \{1 - \frac{1}{2},\ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4},\ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6}, \ldots\}$. Show
that 1 is an upper bound for T.

(b) Let S be the least upper bound of T. Show that the alternating harmonic
series converges to S.

19. (a) Show that the terms of the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$
can be rearranged so that the resulting series converges to 7.
[Hint: Use the divergence of the series $1 + \frac{1}{3} + \frac{1}{5} + \cdots$ (Problem 12) to get
the first partial sum that is greater than 7. Begin the rearrangement with that
partial sum. Then, adding the first negative term, $-\frac{1}{2}$, will make the sum
less than 7. Then add positive terms until the result is just greater than 7;
then add negative terms to get less than 7, and so on.]
(b) Let t be any real number. Show that the terms of the alternating harmonic
series can be rearranged so that the resulting series converges to t.

20. Give an example of a series which converges but does not converge absolutely
(see Problem 15).

21. A series is said to converge conditionally if it converges but does not converge
absolutely. Prove that any conditionally convergent series can be rearranged to
sum to any real number, and can also be rearranged so that it diverges.
[Hint: First show that the sum of the nonnegative terms of the series diverges,
as does the sum of the negative terms of the series.]

22. Prove that, if a series converges absolutely, then all the rearrangements of the
series have the same sum.

23. (Characterization of the rational numbers) A repeating infinite decimal is an
infinite decimal of the form:


$$L.a_1a_2 \cdots a_m b_1b_2 \cdots b_n b_1b_2 \cdots b_n b_1b_2 \cdots b_n \cdots$$


where L is an integer and the $a_i$ and $b_j$ are digits.


(a) Show that every repeating infinite decimal represents a rational number.
(b) Show that every rational number has a representation as a repeating infinite
decimal.


24. A number t is said to be a lower bound for a set S of real numbers if t is less
than or equal to x for every x in S. A number $t_0$ is a greatest lower bound for
the set S if $t_0$ is a lower bound for S and $t_0$ is greater than or equal to t for every
lower bound t of S. Prove that every nonempty set of real numbers that has a
lower bound has a greatest lower bound.
[Hint: Consider the set $T = \{-x : x \in S\}$.]

25. (a) Prove that, if a series converges, then the set of all its partial sums has a
lower bound (see Problem 24).

(b) Prove that a series whose terms are all less than or equal to 0 converges if
and only if the set of all its partial sums has a lower bound.

26. (Construction of least upper bounds) For this problem we assume familiarity
with the construction of the real numbers from sets of rational numbers using
Dedekind cuts, as outlined in Problem 15 in Chapter 8. In that context, the real
number A is said to be less than or equal to the real number B if A is contained
in B.
Suppose that a given nonempty set S of real numbers (which is a set of sets
of rational numbers) has an upper bound. That is, there is a real number that is
greater than or equal to every real number in S. Prove that the union of all of
the real numbers in S is a real number and is the least upper bound of S.

27. A more standard approach to convergence of infinite series begins with the
definition of convergence of any sequence of real numbers. The definition
of convergence that we have given (13.1.4) is the particular case when the
sequence is the sequence of partial sums of a series. We have delayed the
presentation of the more general definition of convergence of sequences because
some people find it more confusing to learn the general definition when they
are first exposed to this topic. However, the general definition is required in
order to obtain many of the standard results on infinite series, and in many
other situations. In this exercise, we state the definition of convergence of an
arbitrary sequence of real numbers, and use that definition to reformulate and
extend some of the results obtained in the chapter and in the problems given
above.

Some examples of sequences are: 


11 
to 
2° 3° 


1 

4’ 
V3, V4, V5, . 
es (ee eee 


In general, a sequence of real numbers is a listing of real numbers, one for each 
natural number. More precisely, a sequence of real numbers can be defined as 
an assignment of a real number to each natural number (that is, as a function 
from the natural numbers to the real numbers). For example, the sequence
$1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots$ is the assignment of the number $\frac{1}{n}$ to each natural number n,
and the sequence $1, -1, 1, -1, 1, -1, \ldots$ is the assignment of 1 to every odd
natural number and $-1$ to every even natural number.
Notation such as $x_1, x_2, x_3, x_4, \ldots$ is often used to denote a sequence.


Sometimes we abbreviate this as $(x_n)$. The crucial concept is that of a limit of
a sequence. The rough idea is that a real number L is the limit of the sequence
$(x_n)$ if $x_n$ is close to L when n is large. More accurately, no matter how close
we wish to get $x_n$ to L, it will be that close if n is sufficiently large. The precise
definition is the following.


Definition 13.8.1. The sequence $(x_n)$ converges to L (or has limit L) if, for every
real number $\varepsilon$ greater than 0, there is a natural number N such that $|x_n - L| < \varepsilon$
whenever $n \geq N$. We use the notation $\lim_{n \to \infty} x_n = L$ to denote the fact that L is
the limit of the sequence $(x_n)$; this is read “the limit as n approaches infinity of
the sequence $(x_n)$ is L.” Sometimes the notation $(x_n) \to L$ is used.


Note that applying this definition to a sequence $(S_n)$ of partial sums of an
infinite series yields Definition 13.1.4. That is, an infinite series converges to
$S$ if and only if the sequence of its partial sums converges to $S$. In studying an
infinite series $a_1 + a_2 + a_3 + \cdots$, there are several different sequences that
naturally arise. The one that we have discussed so far is the sequence $(S_n)$ of
partial sums. Another is the sequence $(a_i)$ of terms of the series. It is important
not to confuse the two. Limits of some other sequences, including some related
to the terms of a series, also play a role.
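To see the two sequences side by side, here is a small sketch that uses the geometric series with terms $a_i = 1/2^i$ as an illustrative choice (it is not taken from the text). Exact fractions are used so that no rounding obscures the pattern.

```python
from fractions import Fraction
from itertools import accumulate

# Terms a_i = 1/2**i of a geometric series (an illustrative choice).
terms = [Fraction(1, 2**i) for i in range(1, 11)]
# Partial sums S_n = a_1 + ... + a_n.
partial_sums = list(accumulate(terms))

for i, (a, s) in enumerate(zip(terms, partial_sums), start=1):
    print(i, a, s)
```

The terms shrink toward 0 while the partial sums approach 1, so the two sequences have different limits.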


(a) Prove that, if a series converges, then its sequence of terms converges to 0
(see Problem 13). That is, if $a_1 + a_2 + a_3 + \cdots$ converges, then $\lim_{i \to \infty} a_i = 0$.
(b) (“The Ratio Test”) Suppose that $a_1 + a_2 + a_3 + \cdots$ is a series of non-zero
terms and suppose that $\lim_{i \to \infty} \left| \frac{a_{i+1}}{a_i} \right| = r$.
(i) Show that the series $a_1 + a_2 + a_3 + \cdots$ converges absolutely if $r < 1$.
[Hint: See Problems 14 and 15.]
(ii) Show that the series $a_1 + a_2 + a_3 + \cdots$ diverges if $r > 1$.
(iii) Give an example of a series that diverges and has $r = 1$.
(iv) Give an example of a series that converges and has $r = 1$.
(c) (“The Root Test”) Suppose that $\lim_{i \to \infty} |a_i|^{1/i} = r$.
(i) Show that the series $a_1 + a_2 + a_3 + \cdots$ converges absolutely if $r < 1$.
[Hint: See Problems 15 and 17.]
(ii) Show that the series $a_1 + a_2 + a_3 + \cdots$ diverges if $r > 1$.
(d) (“The Limit Comparison Test”) Let $a_1 + a_2 + a_3 + \cdots$ and $b_1 + b_2 + b_3 + \cdots$
be series whose terms are all positive. Suppose that $\lim_{i \to \infty} \frac{a_i}{b_i} = r$ for some
$r > 0$. Show that the series $a_1 + a_2 + a_3 + \cdots$ converges if and only if the
series $b_1 + b_2 + b_3 + \cdots$ converges.
Determine which of the following series converge. 
[Hint: It may be useful to use some of the results from this chapter, as well 
as some of the previous parts of this problem. ] 
@Q-24gs7+3=e4% 
«yd 2 3 n a 
(ai) i ie ae 
wey 1100 2100 3,100 4100 7100 
as ade Sey re ls dl ale eg 
Gv) T+ 
Chapter 14
Some Higher Dimensional Spaces


Spaces of two and three dimensions are familiar to most people. Four-dimensional 
space, however, seems more mysterious. Nonetheless, spaces of dimension four and 
higher have been defined by mathematicians in ways that are easy to understand. 
In this chapter, we describe some spaces of various dimensions, including infinite- 
dimensional spaces. 


14.1 Two-Dimensional Space 


The plane, the basic two-dimensional space, can be represented using coordinates. 
We consider two mutually perpendicular axes in the plane, one horizontal and one 
vertical, and represent points in the plane as pairs of numbers relative to those axes. 
Instead of calling those axes the x and y axes, as is most common, in this chapter 
we prefer to call them the horizontal and vertical axes. The point of intersection of 
the axes is called the origin. 

Each point in the plane is assigned an ordered pair of real numbers as coordinates. 
A point is assigned the coordinates (x1, x2) as follows. The first coordinate, x1, is 
the distance from the point to the vertical axis if the point is to the right of the 
vertical axis and is the negative of the distance from the point to the vertical axis if 
the point is to the left of the vertical axis. If the point is on the vertical axis, then 
x1 = 0. Similarly, the point has second coordinate x2 equal to the distance from the
point to the horizontal axis if the point is above the horizontal axis and the negative 
of its distance to the horizontal axis if the point is below the horizontal axis. If the 
point is on the horizontal axis, then x2 = 0. 

Some examples of coordinates of points are illustrated in Figure 14.1. 

The above associates a unique ordered pair of real numbers to each point in the 
plane. Conversely, if (x1, x2) is any pair of real numbers, then (x1, x2) corresponds 
to a unique point in the plane.


Fig. 14.1 Some points in the plane 


The geometry of the plane can be expressed in terms of coordinates. For example, 
one fundamental concept is that of the distance between two points. The distance 
formula is the following. 


Theorem 14.1.1. If $x = (x_1, x_2)$ and $y = (y_1, y_2)$ are points in the plane, then the
distance from $x$ to $y$ is

$$\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$$


Proof. We first consider some special cases. If $x_1 = y_1$, then the two points lie
on the same vertical line and so the distance between them is $|x_2 - y_2|$. Since
$x_1 - y_1 = 0$, this is equal to $\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$. Similarly, if $x_2 = y_2$, then
the two points lie on the same horizontal line and the result follows.

Now assume that $x_1 \neq y_1$ and $x_2 \neq y_2$. Consider the triangle whose vertices are
$(x_1, x_2)$, $(y_1, y_2)$, and $(x_1, y_2)$. We illustrate the situation in the case where all the
coordinates are positive in Figure 14.2, but the proof is the same in all cases. Note
that the triangle is a right triangle.




Fig. 14.2 Proof of the distance formula


The distance between the points $(x_1, x_2)$ and $(y_1, y_2)$ is the length of the
hypotenuse of the triangle; let’s call this distance $d$. The legs have lengths $|x_1 - y_1|$
and $|x_2 - y_2|$. By the Pythagorean Theorem (11.3.6), $d^2 = |x_1 - y_1|^2 + |x_2 - y_2|^2$.
Since $|x_1 - y_1|^2 = (x_1 - y_1)^2$ and $|x_2 - y_2|^2 = (x_2 - y_2)^2$, it follows that

$$d^2 = (x_1 - y_1)^2 + (x_2 - y_2)^2$$

Thus, $d = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$. □
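The construction in the proof can be followed numerically. This is a minimal sketch in which the function name `distance` and the sample points are illustrative choices, not part of the text.

```python
import math

def distance(x, y):
    """Distance between points x = (x1, x2) and y = (y1, y2), per Theorem 14.1.1."""
    return math.sqrt((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)

# The right triangle from the proof has vertices x, y, and (x1, y2).
x, y = (1.0, 5.0), (4.0, 1.0)
corner = (x[0], y[1])
legs = distance(x, corner), distance(corner, y)
print(legs, distance(x, y))
```

The two legs and the hypotenuse printed here satisfy the Pythagorean relation used in the proof.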


It is often useful to consider the points in the plane as being represented by line 
segments emanating from the origin. Such line segments are called vectors. For 
example, the vector (—1, 2) is represented by the line segment from the origin to 
the point (—1, 2) (see Figure 14.3). The zero vector is simply (0, 0). 




Fig. 14.3 A vector in the plane 


Another fundamental geometric concept is that of an angle between two vectors.
The angle between the vectors $(x_1, x_2)$ and $(y_1, y_2)$ can be expressed in terms of
their coordinates (see Theorem 14.4.3). However, for present purposes, we only
consider the characterization of perpendicularity.


Theorem 14.1.2. The vectors $(x_1, x_2)$ and $(y_1, y_2)$ are perpendicular to each other
if and only if $x_1y_1 + x_2y_2 = 0$.


Proof. Notice that if one, or both, of the two vectors is the zero vector, then
$x_1y_1 + x_2y_2 = 0$. For this reason we define the zero vector as being perpendicular to every
vector. Another special case is when the two vectors lie on the same line through
the origin; that is, when $y_1 = tx_1$ and $y_2 = tx_2$ for some real number $t$. In this case,
the vectors are perpendicular if and only if one of them is the zero vector, which
happens if and only if $x_1y_1 + x_2y_2 = tx_1^2 + tx_2^2 = 0$. Thus, the theorem holds in this case.

Now assume that the two vectors do not lie on the same line through the origin.
Consider the triangle with vertices $(x_1, x_2)$, $(y_1, y_2)$, and the origin, $(0, 0)$, as
pictured in Figure 14.4. The vector $(x_1, x_2)$ is perpendicular to the vector $(y_1, y_2)$
if and only if the angle of this triangle at the vertex $(0, 0)$ is 90 degrees. By the
Pythagorean Theorem (11.3.6) and its converse (Problem 15 in Chapter 11), this
holds if and only if the sum of the squares of the lengths of the sides from $(0, 0)$
to $(x_1, x_2)$ and from $(0, 0)$ to $(y_1, y_2)$ is equal to the square of the distance between
$(x_1, x_2)$ and $(y_1, y_2)$.


Fig. 14.4 Perpendicularity of vectors


The squares of those lengths are, respectively, $x_1^2 + x_2^2$, $y_1^2 + y_2^2$,
and $(x_1 - y_1)^2 + (x_2 - y_2)^2$. The latter is equal to

$$x_1^2 - 2x_1y_1 + y_1^2 + x_2^2 - 2x_2y_2 + y_2^2$$

Thus, the vectors are perpendicular if and only if

$$x_1^2 + x_2^2 + y_1^2 + y_2^2 = x_1^2 - 2x_1y_1 + y_1^2 + x_2^2 - 2x_2y_2 + y_2^2$$

This equality holds if and only if $-2x_1y_1 - 2x_2y_2 = 0$, which is equivalent to
$x_1y_1 + x_2y_2 = 0$. □
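The algebra in this proof can be checked on concrete vectors. In the sketch below, the helper names (`inner`, `squared_distance`, `pythagorean_gap`) and the sample vectors are hypothetical; integer coordinates keep the arithmetic exact.

```python
def inner(x, y):
    """x1*y1 + x2*y2 for vectors in the plane."""
    return x[0] * y[0] + x[1] * y[1]

def squared_distance(x, y):
    return (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2

def pythagorean_gap(x, y):
    """Sum of the squared lengths of x and y, minus the squared distance from x to y.

    Expanding the squares shows this always equals 2*(x1*y1 + x2*y2),
    so it vanishes exactly when the coordinate test does.
    """
    origin = (0, 0)
    return (squared_distance(origin, x) + squared_distance(origin, y)
            - squared_distance(x, y))

for x, y in [((3, 4), (-4, 3)), ((1, 2), (2, 1)), ((-1, 2), (6, 3))]:
    assert pythagorean_gap(x, y) == 2 * inner(x, y)
    print(x, y, "perpendicular" if inner(x, y) == 0 else "not perpendicular")
```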


The plane with its standard geometry is called two-dimensional Euclidean space. 
Because of its representation as pairs of real numbers, it is often denoted $\mathbb{R}^2$.


14.2 Three-Dimensional Space


The space that we live in appears to be three-dimensional. Locating points in three- 
dimensional space requires a triple of numbers. Start with a plane in which there are 
mutually perpendicular horizontal and vertical axes, as discussed above. Each point 
in three-dimensional space is either on, above, or below the given plane. We assign 
a triple of coordinates to each point, as follows. Introduce a third axis perpendicular 
to the plane and going through the intersection of the horizontal and vertical axes; 
the horizontal, vertical, and third axes are mutually perpendicular. The point of 
intersection of these three axes is called the origin. 

To assign a triple of real numbers to each point, begin by dropping a perpendicu- 
lar from the point to the plane. The perpendicular intersects the plane in some point; 
let (x1, x2) be the coordinates of that point in the plane. Now let x3 be the length of 
that perpendicular to the plane if the point is above the plane, 0 if the point is on the 
plane, and the negative of the length of that perpendicular if the point is below the 
plane. The triple (x1, x2, x3) gives the coordinates of the point.

As in the two-dimensional case, the geometry of three-dimensional space can be
captured in terms of the coordinates of points. There is a distance formula that is
similar to the formula in two-dimensional space.


Theorem 14.2.1. The distance between the points $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$ is

$$\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2}$$


Proof. Begin by considering the case where the two points agree in one of their
coordinates. We prove the case where $x_3 = y_3$. (The proof is similar if $x_1 = y_1$ or
$x_2 = y_2$.) Then the two points both lie in a plane which is either on (if $x_3 = 0$),
above (if $x_3$ is greater than 0), or below (if $x_3$ is less than 0) the plane consisting of
all points whose third coordinate is 0. The distance between the points is the same
as the distance between $(x_1, x_2)$ and $(y_1, y_2)$, which is $\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$
by Theorem 14.1.1. Since $x_3 = y_3$, $(x_3 - y_3)^2 = 0$, so the distance between the
points is $\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2}$.

The general case can be obtained from the above special case, as follows.
Consider the triangle with vertices $(x_1, x_2, x_3)$, $(y_1, y_2, y_3)$ and $(y_1, y_2, x_3)$, as in
Figure 14.5. (In the diagram, the case where $y_3$ is greater than $x_3$ is shown; if $y_3$ is
less than $x_3$, the picture is similar.)


Fig. 14.5 Proving the distance formula in three-dimensional space


This is a right triangle since the line segment from $(x_1, x_2, x_3)$ to $(y_1, y_2, x_3)$ lies
in the plane consisting of all points whose third coordinate is $x_3$, and the point
$(y_1, y_2, y_3)$ is directly above or below the point $(y_1, y_2, x_3)$. The length of the
hypotenuse of this right triangle is the distance from $(x_1, x_2, x_3)$ to $(y_1, y_2, y_3)$.
By the special case, the square of the distance from $(x_1, x_2, x_3)$ to $(y_1, y_2, x_3)$
is $(x_1 - y_1)^2 + (x_2 - y_2)^2$. The square of the distance between the points
$(y_1, y_2, y_3)$ and $(y_1, y_2, x_3)$ is $(x_3 - y_3)^2$, since one of those points lies directly
above the other (depending upon which of $x_3$ and $y_3$ is larger). By the Pythagorean
Theorem (11.3.6), the square of the length of the hypotenuse is
$(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2$, which proves the formula. □
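The two-step argument, first moving within a horizontal plane and then moving vertically, can be sketched as follows. The name `distance3` and the sample points are illustrative assumptions.

```python
import math

def distance3(x, y):
    """Distance in three-dimensional space, per Theorem 14.2.1."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

x, y = (1.0, 2.0, 3.0), (3.0, 5.0, 9.0)
# Intermediate vertex (y1, y2, x3): same height as x, directly below or above y.
corner = (y[0], y[1], x[2])
horizontal = distance3(x, corner)   # leg inside the plane "third coordinate = x3"
vertical = distance3(corner, y)     # leg along the third axis
print(math.hypot(horizontal, vertical), distance3(x, y))
```

Up to rounding, the hypotenuse computed from the two legs agrees with the direct formula.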


As in the two-dimensional case, we frequently think of points in three- 
dimensional space as corresponding to line segments from the origin to the points, 
which we call vectors. There is an important characterization of perpendicularity of
vectors in terms of their coordinates; it is similar to the formula in two dimensions.
As in the two-dimensional case, we call the vector whose coordinates are all zero
the zero vector and regard it as being perpendicular to every vector.


Theorem 14.2.2. The vectors $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$ are perpendicular to
each other if and only if $x_1y_1 + x_2y_2 + x_3y_3 = 0$.


Proof. As in the two-dimensional case, the result is clearly true if the vectors are
multiples of each other. So assume that the vectors are not multiples of each other
and consider the triangle with vertices $(0, 0, 0)$, $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$. By
the Pythagorean Theorem (11.3.6) and its converse (Problem 15 in Chapter 11), the
vectors are perpendicular if and only if the sum of the squares of the lengths of the
segments from $(0, 0, 0)$ to $(x_1, x_2, x_3)$ and from $(0, 0, 0)$ to $(y_1, y_2, y_3)$ is equal to
the square of the distance from $(x_1, x_2, x_3)$ to $(y_1, y_2, y_3)$. Computing the sum of
the squares of the lengths of the first two segments gives $x_1^2 + x_2^2 + x_3^2 + y_1^2 + y_2^2 + y_3^2$.
The square of the length of the third is $(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2$. The latter
equals

$$x_1^2 - 2x_1y_1 + y_1^2 + x_2^2 - 2x_2y_2 + y_2^2 + x_3^2 - 2x_3y_3 + y_3^2$$

This is equal to $x_1^2 + x_2^2 + x_3^2 + y_1^2 + y_2^2 + y_3^2$ if and only if $-2x_1y_1 - 2x_2y_2 - 2x_3y_3 = 0$,
which is equivalent to $x_1y_1 + x_2y_2 + x_3y_3 = 0$. □


This space, consisting of triples of real numbers with the distance between two 
triples given by the distance formula in Theorem 14.2.1, is called three-dimensional 
Euclidean space. Because of its representation as triples of real numbers, it is often
denoted $\mathbb{R}^3$.


14.3 Spaces of Dimension Four and Higher 


Many people have difficulty with the idea of a four-dimensional space, since they 
cannot conceive of four axes each of which is perpendicular to all of the other three. 
It is true that four such axes cannot be constructed within the three-dimensional 
space that we appear to live in. On the other hand, we can easily think of four- 
tuples of numbers. In the cases of two and three-dimensional spaces, we started 
with an understanding of the geometry and represented it in terms of coordinates. 
For spaces of dimension four (and higher) we reverse the process. We define four- 
dimensional space in terms of four-tuples, and then introduce geometric ideas using 
the coordinates. 


Definition 14.3.1. Four-dimensional Euclidean space, denoted $\mathbb{R}^4$, is the set of all
four-tuples of real numbers $(x_1, x_2, x_3, x_4)$, with the distance between the four-tuples
$(x_1, x_2, x_3, x_4)$ and $(y_1, y_2, y_3, y_4)$ defined to be

$$\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + (x_4 - y_4)^2}$$

Even without “seeing” four dimensions, we can study four-dimensional space
in terms of the coordinates of points. As in $\mathbb{R}^2$ and $\mathbb{R}^3$, we often think of a point
$(x_1, x_2, x_3, x_4)$ in $\mathbb{R}^4$ as a vector from the origin, $(0, 0, 0, 0)$, to $(x_1, x_2, x_3, x_4)$. As
before, this vector is also denoted $(x_1, x_2, x_3, x_4)$. We can define perpendicularity
of vectors in $\mathbb{R}^4$ by extending the characterization that we obtained in the two and
three-dimensional cases (Theorems 14.1.2 and 14.2.2).


Definition 14.3.2. The vectors $(x_1, x_2, x_3, x_4)$ and $(y_1, y_2, y_3, y_4)$ are perpendicular
if $x_1y_1 + x_2y_2 + x_3y_3 + x_4y_4 = 0$.


Once we have defined four-dimensional space in terms of four-tuples of numbers, 
it is natural to define five-dimensional space in terms of five-tuples of numbers, and 
seventeen-dimensional space in terms of seventeen-tuples of numbers. In fact, for 
every natural number n, n-dimensional Euclidean space can be defined in terms 
of n-tuples of real numbers. (One-dimensional Euclidean space is simply the real 
numbers.) 


Definition 14.3.3. For each natural number $n$, n-dimensional Euclidean space,
denoted $\mathbb{R}^n$, is the set of all n-tuples of real numbers $(x_1, x_2, \ldots, x_n)$ with the
distance between the n-tuples $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$ defined to be

$$\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2}$$


The elements of n-dimensional space are also called points or vectors. 


Definition 14.3.4. In $\mathbb{R}^n$, the zero vector is the vector whose coordinates are all 0.
We often use the notation 0 to denote the zero vector; it is apparent from the context
whether 0 refers to the number 0 or to the vector 0.


We define perpendicularity for vectors in $\mathbb{R}^n$ by extending the characterization
that we obtained in the two and three-dimensional cases (Theorems 14.1.2
and 14.2.2).


Definition 14.3.5. The vectors $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$ are perpendicular
if $x_1y_1 + x_2y_2 + \cdots + x_ny_n = 0$.
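Because these definitions are purely in terms of coordinates, they translate directly into a short program that works in any dimension. The function names below are hypothetical, and the exact comparison with 0 is only reliable for integer (or other exactly represented) coordinates.

```python
import math

def distance(x, y):
    """Distance between two n-tuples, as in Definition 14.3.3."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def are_perpendicular(x, y):
    """Coordinate test of Definition 14.3.5 (exact for integer coordinates)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of coordinates")
    return sum(a * b for a, b in zip(x, y)) == 0

print(distance((1, 1, 1, 1), (3, 3, 3, 3)))                 # a distance in four dimensions
print(are_perpendicular((1, 2, 3, 4, 5), (5, -4, 1, 0, 0)))  # a test in five dimensions
```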


We next discuss some properties of n-dimensional spaces. 


14.4 Norms and Inner Products 


In some spaces, there is a way of capturing the idea of the angle between two 
vectors. We begin by discussing this concept in $\mathbb{R}^2$. Some preliminary definitions
are required.


Definition 14.4.1. The inner product (sometimes called the scalar product or dot
product) of the vectors $x = (x_1, x_2)$ and $y = (y_1, y_2)$ in $\mathbb{R}^2$ is $x_1y_1 + x_2y_2$. The
inner product of the vectors $x$ and $y$ is denoted $\langle x, y \rangle$. That is, $\langle x, y \rangle = x_1y_1 + x_2y_2$.


Note that if $x = (x_1, x_2)$, then $\langle x, x \rangle = x_1^2 + x_2^2$, which is the square of the
distance from $(0, 0)$ to $(x_1, x_2)$.


Definition 14.4.2. The norm, or length, of the vector $x = (x_1, x_2)$ in $\mathbb{R}^2$ is
$\sqrt{\langle x, x \rangle} = \sqrt{x_1^2 + x_2^2}$. It is denoted $\|x\|$.

Thus, $\|x\|$ is the distance from the origin $(0, 0)$ to the point $(x_1, x_2)$.


Theorem 14.4.3. For $x$ and $y$ in $\mathbb{R}^2$,

$$\langle x, y \rangle = \|x\| \cdot \|y\| \cos\theta$$

where $\theta$ is the angle between the vectors $x$ and $y$.


Proof. Since we regard the zero vector as being perpendicular to all vectors, the
formula holds if $x$ or $y$ is the zero vector. So assume that $x$ and $y$ are non-zero
vectors. (Note that the angle between two non-zero vectors is between 0 and $\pi$
radians, equivalently, between 0 and 180 degrees.)


Fig. 14.6 Angle between vectors 


Consider Figure 14.6. The lengths of the sides of the angle $\theta$ are $\|x\|$ and $\|y\|$. Let
$d$ be the distance between $x$ and $y$. The Law of Cosines (Problem 16 in Chapter 11)
gives $d^2 = \|x\|^2 + \|y\|^2 - 2\|x\| \cdot \|y\| \cos\theta$. Rearranging the terms in the equation
gives

$$\|x\| \cdot \|y\| \cos\theta = \frac{1}{2}\left(\|x\|^2 + \|y\|^2 - d^2\right)$$

In terms of the coordinates of $x$ and $y$, the right-hand side of this equation is equal
to

$$\frac{1}{2}\left(x_1^2 + x_2^2 + y_1^2 + y_2^2 - \left[(x_1 - y_1)^2 + (x_2 - y_2)^2\right]\right)$$

We must show that this is equal to $\langle x, y \rangle$. First, it is equal to

$$\frac{1}{2}\left(x_1^2 + x_2^2 + y_1^2 + y_2^2 - \left[x_1^2 - 2x_1y_1 + y_1^2 + x_2^2 - 2x_2y_2 + y_2^2\right]\right)$$

which reduces to

$$\frac{1}{2}\left(2x_1y_1 + 2x_2y_2\right) = x_1y_1 + x_2y_2$$

Since $x_1y_1 + x_2y_2 = \langle x, y \rangle$, this gives $\|x\| \cdot \|y\| \cos\theta = \langle x, y \rangle$. □
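Solving the formula of Theorem 14.4.3 for $\theta$ gives a way to compute the angle between two non-zero vectors from their coordinates. The sketch below is one possible way to do this; the function names are hypothetical, and the clamping step is a floating-point precaution rather than part of the mathematics.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

def angle(x, y):
    """Angle in radians between two non-zero vectors, solved from Theorem 14.4.3."""
    cosine = inner(x, y) / (norm(x) * norm(y))
    # Rounding can push the quotient slightly outside [-1, 1]; clamp before acos.
    return math.acos(max(-1.0, min(1.0, cosine)))

print(math.degrees(angle((1, 0), (1, 1))))
print(math.degrees(angle((3, 4), (-4, 3))))
```

The second pair has inner product 0, so the computed angle is a right angle, in agreement with Theorem 14.1.2.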

Similarly, in $\mathbb{R}^3$ the norm of the vector $x = (x_1, x_2, x_3)$ is defined to be
$\sqrt{x_1^2 + x_2^2 + x_3^2}$ and is denoted $\|x\|$. The inner product of the vectors
$x = (x_1, x_2, x_3)$ and $y = (y_1, y_2, y_3)$ is defined to be $x_1y_1 + x_2y_2 + x_3y_3$ and is denoted
$\langle x, y \rangle$. As in $\mathbb{R}^2$, it can be proven that in $\mathbb{R}^3$, $\langle x, y \rangle = \|x\| \cdot \|y\| \cos\theta$, where $\theta$ is
the angle between the vectors $x$ and $y$ (see Problem 9).

Note that in both $\mathbb{R}^2$ and $\mathbb{R}^3$ the fact that the vector $x$ is perpendicular to the
vector $y$ if and only if $\langle x, y \rangle = 0$ (Theorems 14.1.2 and 14.2.2) is a special case of
the above, since, for $\theta$ between 0 and $\pi$, $\cos\theta = 0$ if and only if $\theta = \frac{\pi}{2}$ (which is
90°).

Similar definitions can be made in every $\mathbb{R}^n$.


Definition 14.4.4. For vectors $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ in
$\mathbb{R}^n$, the inner product of $x$ and $y$ is $x_1y_1 + x_2y_2 + \cdots + x_ny_n$; it is denoted $\langle x, y \rangle$.
The norm of the vector $x$, denoted $\|x\|$, is

$$\sqrt{\langle x, x \rangle} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$


There are two basic operations on $\mathbb{R}^n$, addition of vectors and multiplication of
vectors by real numbers.


Definition 14.4.5. For vectors $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ in
$\mathbb{R}^n$, the sum of $x$ and $y$, denoted $x + y$, is the vector $(x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$.
For $t$ a real number, the product of $t$ and the vector $x$ is $(tx_1, tx_2, \ldots, tx_n)$ and is
denoted $tx$. The vector $x - y$ is defined to be $x + (-1)y$, which is, in terms of its
coordinates, $(x_1 - y_1, x_2 - y_2, \ldots, x_n - y_n)$.


Note that the distance between the vectors $x$ and $y$ is $\|x - y\|$ (Definition 14.3.3).
There are some important relationships between these operations and norms and
inner products.


Theorem 14.4.6 (Properties of Inner Products). Let $x$, $y$, and $z$ be vectors in $\mathbb{R}^n$
and let $t$ be a real number. Then:


(i) $\langle x, y \rangle = \langle y, x \rangle$
(ii) $\langle tx, y \rangle = t\langle x, y \rangle$
(iii) $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$


Proof. Each of the above is easily verified by simply writing out both sides of the
equations in terms of the coordinates of the vectors. □

It follows immediately from properties (i) and (ii) that $\langle x, ty \rangle = t\langle x, y \rangle$, and
from properties (i) and (iii) that $\langle x, y + z \rangle = \langle x, y \rangle + \langle x, z \rangle$.
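These identities can be spot-checked on sample vectors. The sketch below uses hypothetical helper names and arbitrary integer vectors in $\mathbb{R}^4$ so that every comparison is exact; it illustrates the identities and is not a substitute for the coordinate-wise proof.

```python
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def scale(t, x):
    return tuple(t * a for a in x)

# Arbitrary sample data (illustrative only).
x, y, z, t = (1, -2, 3, 0), (4, 0, -1, 2), (-3, 5, 2, 7), 6

assert inner(x, y) == inner(y, x)                           # property (i)
assert inner(scale(t, x), y) == t * inner(x, y)             # property (ii)
assert inner(add(x, y), z) == inner(x, z) + inner(y, z)     # property (iii)
# The two consequences noted after the proof:
assert inner(x, scale(t, y)) == t * inner(x, y)
assert inner(x, add(y, z)) == inner(x, y) + inner(x, z)
print("all identities hold for this sample")
```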


Theorem 14.4.7 (Properties of Norms). Let $x$ be a vector in $\mathbb{R}^n$ and let $t$ be a real
number. Then:


(i) $\|x\| \geq 0$
(ii) $\|x\| = 0$ if and only if $x$ is the zero vector
(iii) $\|tx\| = |t| \cdot \|x\|$


Proof. The above follow very simply upon writing the norm of the vector $x$ in terms
of the coordinates of $x$. □


The following inequality is important. It has many known proofs. The proof that 
we present is probably the simplest to verify (although it may not be the simplest to 
motivate). 


The Cauchy-Schwarz Inequality 14.4.8. For any vectors $x$ and $y$ in $\mathbb{R}^n$,

$$|\langle x, y \rangle| \leq \|x\| \cdot \|y\|$$


Proof. This inequality is obviously true if $y$ is the zero vector, since in that case
both sides are 0. In the proof that follows we assume that $y$ is not the zero vector
(and thus, by Theorem 14.4.7, that $\|y\|^2 \neq 0$).

For every real number $t$, $\|x - ty\|^2 \geq 0$ (Theorem 14.4.7(i)). As we now show, the
theorem follows by applying this inequality with a cleverly chosen $t$ and using the
properties of norm and inner product that are listed in Theorems 14.4.6 and 14.4.7.

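First, note that

$$\begin{aligned}
\|x - ty\|^2 &= \langle x - ty, x - ty \rangle \\
&= \langle x - ty, x \rangle + \langle x - ty, -ty \rangle \\
&= \langle x, x \rangle - \langle ty, x \rangle + \langle x, -ty \rangle + \langle -ty, -ty \rangle \\
&= \|x\|^2 - 2t\langle x, y \rangle + t^2\|y\|^2
\end{aligned}$$

We know that this quantity is nonnegative for every real number $t$. Using this fact
for $t = \frac{\langle x, y \rangle}{\|y\|^2}$ gives the inequality

$$0 \leq \|x\|^2 - 2\,\frac{\langle x, y \rangle}{\|y\|^2}\,\langle x, y \rangle + \frac{\langle x, y \rangle^2}{\|y\|^4}\,\|y\|^2 = \|x\|^2 - 2\,\frac{\langle x, y \rangle^2}{\|y\|^2} + \frac{\langle x, y \rangle^2}{\|y\|^2}$$

Thus,

$$0 \leq \|x\|^2 - \frac{\langle x, y \rangle^2}{\|y\|^2}$$

or

$$\frac{\langle x, y \rangle^2}{\|y\|^2} \leq \|x\|^2$$

Therefore, $\langle x, y \rangle^2 \leq \|x\|^2 \cdot \|y\|^2$, so $|\langle x, y \rangle| \leq \|x\| \cdot \|y\|$. □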


The Cauchy-Schwarz Inequality has a number of important applications. One is
in proving a crucial property of the norm.


Theorem 14.4.9 (The Triangle Inequality for Vectors). If $x$ and $y$ are vectors in
$\mathbb{R}^n$, then $\|x + y\| \leq \|x\| + \|y\|$.


Proof. This follows from the Cauchy-Schwarz Inequality (14.4.8) and the properties
of the inner product (Theorem 14.4.6) by an easy direct computation, as follows:

$$\begin{aligned}
\|x + y\|^2 &= \langle x + y, x + y \rangle \\
&= \langle x + y, x \rangle + \langle x + y, y \rangle \\
&= \langle x, x \rangle + \langle y, x \rangle + \langle x, y \rangle + \langle y, y \rangle \\
&= \langle x, x \rangle + 2\langle x, y \rangle + \langle y, y \rangle \\
&\leq \|x\|^2 + 2\|x\| \cdot \|y\| + \|y\|^2 \\
&= (\|x\| + \|y\|)^2
\end{aligned}$$

Therefore, $\|x + y\| \leq \|x\| + \|y\|$. □


The above is called the “Triangle Inequality” because, in $\mathbb{R}^2$ or $\mathbb{R}^3$, it has the
interpretation that the sum of the lengths of two sides of a triangle is greater than or
equal to the length of the third side.


14.5 Infinite-Dimensional Spaces 


Mathematicians have created a number of different infinite-dimensional spaces. We 
only discuss a few limited aspects of several such spaces. It seems natural to begin 
by considering the collection of all sequences of real numbers.


Example 14.5.1. Let $S$ denote the collection of all sequences $(x_1, x_2, x_3, x_4, \ldots)$
where each $x_i$ is a real number.

(Note that the “...” means that the sequence continues indefinitely; i.e., there is
a term corresponding to each natural number.)

For example, $(\sqrt{2}, \sqrt{3}, \sqrt{4}, \sqrt{5}, \ldots)$ and $(1, -1, 1, -1, \ldots)$ are points in $S$.
While the space of all sequences may have some uses, its utility is limited by the
fact that the distance between points cannot be defined in a way analogous to the
definition in $\mathbb{R}^n$. We would want the distance between the points $(x_1, x_2, x_3, \ldots)$
and $(y_1, y_2, y_3, \ldots)$ to be the square root of $(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots$.
This would only make sense if $(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots$ is a
convergent series (see Definition 13.1.4). For most sequences $(x_1, x_2, x_3, \ldots)$ and
$(y_1, y_2, y_3, \ldots)$, however, this series will not converge. For example, the distance
between $(0, 0, 0, \ldots)$ and $(1, 1, 1, \ldots)$ would not be defined.
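The failure in the last example can be made concrete by computing partial sums of the series of squared differences. The sketch below represents infinite sequences as Python iterators; the function name is a hypothetical choice.

```python
from itertools import accumulate, islice, repeat

def squared_difference_partial_sums(xs, ys, count):
    """First `count` partial sums of (x1 - y1)^2 + (x2 - y2)^2 + ..."""
    squares = ((a - b) ** 2 for a, b in zip(xs, ys))
    return list(islice(accumulate(squares), count))

# (0, 0, 0, ...) versus (1, 1, 1, ...): the n-th partial sum is n, so the
# partial sums are unbounded and the series does not converge.
print(squared_difference_partial_sums(repeat(0), repeat(1), 8))
```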


Though we cannot use this definition of distance for all sequences in S, we can 
use it for all sequences in some “smaller” infinite-dimensional spaces. 


Example 14.5.2. Let $\mathcal{F}$ denote the collection of sequences of real numbers with
only a finite number of nonzero terms. That is, $\mathcal{F}$ consists of the set of all sequences
$(x_1, x_2, \ldots, x_n, 0, 0, 0, \ldots)$ for natural numbers $n$ and real numbers $x_i$. Define the
distance between $(x_1, x_2, \ldots, x_n, 0, 0, 0, \ldots)$ and $(y_1, y_2, \ldots, y_m, 0, 0, 0, \ldots)$ to
be the square root of $(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots$. Since only a finite
number of terms of this sum are different from 0, the distance is finite.


Some of the other basic definitions in $\mathbb{R}^n$ can be extended to the space $\mathcal{F}$. We
define the sum of two elements of $\mathcal{F}$ to be the coordinate-wise sum. That is, the
sum of $(x_1, x_2, \ldots, x_n, 0, 0, 0, \ldots)$ and $(y_1, y_2, \ldots, y_m, 0, 0, 0, \ldots)$ is defined as
$(x_1 + y_1, x_2 + y_2, x_3 + y_3, \ldots)$. The product of an element of $\mathcal{F}$ by a real number
$t$ is defined to be the coordinate-wise product. In other words, the product of $t$ and
$(x_1, x_2, \ldots, x_n, 0, 0, 0, \ldots)$ is $(tx_1, tx_2, \ldots, tx_n, 0, 0, 0, \ldots)$.

The inner product $\langle x, y \rangle$ of the elements $x = (x_1, x_2, \ldots, x_n, 0, 0, 0, \ldots)$ and
$y = (y_1, y_2, \ldots, y_m, 0, 0, 0, \ldots)$ is defined to be $x_1y_1 + x_2y_2 + x_3y_3 + \cdots$. This
sum is finite since each of $x$ and $y$ has only a finite number of nonzero coordinates.
The norm of $x$ is then defined to be $\sqrt{\langle x, x \rangle}$; it is denoted by $\|x\|$.
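Since every element of $\mathcal{F}$ has only finitely many nonzero coordinates, it can be stored on a computer exactly. The following sketch assumes one possible representation, a tuple holding the leading coordinates with the trailing zeros left implicit; the function names are hypothetical.

```python
import math
from itertools import zip_longest

# An element (x1, ..., xn, 0, 0, 0, ...) is stored as the tuple (x1, ..., xn);
# the infinitely many trailing zeros are implicit.

def add(x, y):
    return tuple(a + b for a, b in zip_longest(x, y, fillvalue=0))

def scale(t, x):
    return tuple(t * a for a in x)

def inner(x, y):
    # Coordinates beyond the shorter tuple multiply against an implicit 0.
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x))

def distance(x, y):
    return norm(add(x, scale(-1, y)))

x, y = (3, 4), (0, 4, 12)
print(add(x, y), inner(x, y), norm(x), distance(x, y))
```

Note that the two sample elements have different numbers of stored coordinates; padding with zeros is what lets the operations treat them as elements of the same space.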

Infinite-dimensional spaces have many applications in mathematics. The more
important infinite-dimensional spaces have a property called “completeness,” which
we shall not discuss. The space $\mathcal{F}$ is not complete.

An infinite-dimensional space that is much more useful than $\mathcal{F}$ is the following
one. It contains more than just the finite sequences but does not contain nearly all
sequences. The space we now define is a prototypical example of what is called
a Hilbert space. This space, and slight variants of it (in particular, using complex
numbers rather than real numbers), are very important in mathematics and some
areas of physics (including quantum mechanics).


Definition 14.5.3. The space $\ell^2$ consists of the set of all sequences of real numbers
such that the sum of the squares of the terms of each sequence converges. That is, a
sequence $x = (x_1, x_2, x_3, \ldots)$ is in $\ell^2$ if $x_1^2 + x_2^2 + x_3^2 + \cdots$ converges.


The elements of $\ell^2$ are referred to as vectors or points in $\ell^2$. The zero vector,
denoted by 0, is the sequence in $\ell^2$ whose coordinates are all zero. The norm on $\ell^2$
is defined as follows.

Definition 14.5.4. For $x = (x_1, x_2, x_3, \ldots)$ in $\ell^2$, the norm of $x$, denoted $\|x\|$, is

$$\sqrt{x_1^2 + x_2^2 + x_3^2 + \cdots}$$


To establish some basic properties of $\ell^2$, we need the following lemma.


Lemma 14.5.5. If $(x_1, x_2, x_3, \ldots)$ and $(y_1, y_2, y_3, \ldots)$ are in $\ell^2$, then the infinite
series $x_1y_1 + x_2y_2 + x_3y_3 + \cdots$ converges.


Proof. The indicated series converges if it converges absolutely (Problem 15 in
Chapter 13). That is, it suffices to show that $|x_1y_1| + |x_2y_2| + |x_3y_3| + \cdots$ converges,
and this will follow if it is established that there is an upper bound for the set of
partial sums (Theorem 13.5.1).

Let $x = (x_1, x_2, x_3, \ldots)$ and $y = (y_1, y_2, y_3, \ldots)$. We claim that every partial
sum of $|x_1y_1| + |x_2y_2| + |x_3y_3| + \cdots$ is less than or equal to $\|x\| \cdot \|y\|$. To
see this, consider any partial sum $S_k = |x_1y_1| + |x_2y_2| + \cdots + |x_ky_k|$. The
Cauchy-Schwarz Inequality in $\mathbb{R}^k$ (14.4.8), applied to the vectors $(|x_1|, \ldots, |x_k|)$
and $(|y_1|, \ldots, |y_k|)$, gives

$$|x_1y_1| + |x_2y_2| + \cdots + |x_ky_k| \leq \sqrt{x_1^2 + x_2^2 + \cdots + x_k^2} \cdot \sqrt{y_1^2 + y_2^2 + \cdots + y_k^2}$$

Clearly, $\sqrt{x_1^2 + x_2^2 + \cdots + x_k^2} \leq \sqrt{x_1^2 + x_2^2 + x_3^2 + \cdots}$ and
$\sqrt{y_1^2 + y_2^2 + \cdots + y_k^2} \leq \sqrt{y_1^2 + y_2^2 + y_3^2 + \cdots}$. This implies that the set of partial sums of
$|x_1y_1| + |x_2y_2| + \cdots$ is bounded by $\|x\| \cdot \|y\|$, and therefore the series converges. □
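The key step of the proof, bounding each partial sum by a product of truncated norms, can be observed numerically. The sequences $x_i = 1/i$ and $y_i = 1/(i+1)$ used below are illustrative choices of elements of $\ell^2$, and `partial_bounds` is a hypothetical helper; floating-point arithmetic makes this a demonstration only.

```python
import math

def partial_bounds(x_term, y_term, k):
    """Return (S_k, bound_k) for the first k coordinates.

    S_k is |x1*y1| + ... + |xk*yk|; bound_k is the product of the norms of
    the truncated vectors, which Cauchy-Schwarz says is at least S_k.
    """
    xs = [x_term(i) for i in range(1, k + 1)]
    ys = [y_term(i) for i in range(1, k + 1)]
    s_k = sum(abs(a * b) for a, b in zip(xs, ys))
    bound_k = math.sqrt(sum(a * a for a in xs)) * math.sqrt(sum(b * b for b in ys))
    return s_k, bound_k

for k in (1, 10, 100, 1000):
    s_k, bound_k = partial_bounds(lambda i: 1 / i, lambda i: 1 / (i + 1), k)
    assert s_k <= bound_k
    print(k, s_k, bound_k)
```

Both columns increase with $k$ but stay bounded, which is what allows the series of products to converge.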


This lemma allows us to define an inner product on $\ell^2$, as follows.


Definition 14.5.6. For vectors $x = (x_1, x_2, x_3, \ldots)$ and $y = (y_1, y_2, y_3, \ldots)$ in $\ell^2$,
the inner product of $x$ and $y$, denoted $\langle x, y \rangle$, is defined to be $x_1y_1 + x_2y_2 + x_3y_3 + \cdots$.


As in the finite-dimensional cases we have discussed, there are natural definitions
of addition for vectors in $\ell^2$ and of multiplication of a vector in $\ell^2$ by a real number.


Definition 14.5.7. If $x = (x_1, x_2, x_3, \ldots)$ and $y = (y_1, y_2, y_3, \ldots)$ are vectors in
$\ell^2$ and $t$ is a real number, then:


(i) $x + y = (x_1 + y_1, x_2 + y_2, x_3 + y_3, \ldots)$
(ii) $tx = (tx_1, tx_2, tx_3, \ldots)$


To show that these operations are well-defined on $\ell^2$, we must prove that the
vectors $x + y$ and $tx$ are in $\ell^2$ whenever $x$ and $y$ are in $\ell^2$ and $t$ is a real number.


Lemma 14.5.8. If $x = (x_1, x_2, x_3, \ldots)$ and $y = (y_1, y_2, y_3, \ldots)$ are vectors in $\ell^2$
and $t$ is a real number, then $x + y$ and $tx$ are in $\ell^2$.


Proof. To prove that $x + y$ is in $\ell^2$, it must be shown that $(x_1 + y_1)^2 + (x_2 + y_2)^2 +
(x_3 + y_3)^2 + \cdots$ converges. For each $k$, $(x_k + y_k)^2 = x_k^2 + 2x_ky_k + y_k^2$. Since $x$ and
$y$ are in $\ell^2$, the series $x_1^2 + x_2^2 + x_3^2 + \cdots$ and $y_1^2 + y_2^2 + y_3^2 + \cdots$ both converge.
The series $x_1y_1 + x_2y_2 + x_3y_3 + \cdots$ converges by Lemma 14.5.5. Therefore, the
series $(x_1^2 + 2x_1y_1 + y_1^2) + (x_2^2 + 2x_2y_2 + y_2^2) + \cdots$ converges (Theorems 13.3.1
and 13.3.3), so $x + y$ is in $\ell^2$.

The series $t^2x_1^2 + t^2x_2^2 + t^2x_3^2 + \cdots$ converges since $x_1^2 + x_2^2 + x_3^2 + \cdots$ does
(by Theorem 13.3.1), so $tx$ is in $\ell^2$. □


Theorem 14.5.9 (Properties of the Norm in $\ell^2$). Let $x$ be in $\ell^2$ and let $t$ be a real
number. Then:


(i) $\|x\| \geq 0$
(ii) $\|x\| = 0$ if and only if $x$ is the zero vector
(iii) $\|tx\| = |t| \cdot \|x\|$


Proof. Each part of this theorem follows easily by writing the norm of the vector $x$
in terms of the coordinates of $x$. Part (iii) also requires the fact that a term-by-term
product of an infinite series by a real number converges to the product of the number
and the sum of the original series (Theorem 13.3.1). □


Theorem 14.5.10 (Properties of the Inner Product in $\ell^2$). The inner product on
$\ell^2$ satisfies the following properties:


(i) $\langle x, y \rangle = \langle y, x \rangle$
(ii) $\langle tx, y \rangle = t\langle x, y \rangle$
(iii) $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$
(iv) $\|x\| = \sqrt{\langle x, x \rangle}$


Proof. This theorem follows easily by writing the vectors $x$ and $y$ in terms
of their coordinates and using the fundamental properties of infinite series (see
Theorems 13.3.1 and 13.3.3). □


Theorem 14.5.11 (The Cauchy-Schwarz Inequality in $\ell^2$). If $x$ and $y$ are in $\ell^2$,
then $|\langle x, y \rangle| \leq \|x\| \cdot \|y\|$.


Proof. The proof of the Cauchy-Schwarz Inequality given for $\mathbb{R}^n$ (Theorem 14.4.8)
depends only on the properties of inner product and norm listed in Theorems 14.4.6
and 14.4.7. Since these properties also hold for $\ell^2$ (Theorems 14.5.9 and 14.5.10),
the theorem follows. □


Theorem 14.5.12 (The Triangle Inequality in $\ell^2$). If $x$ and $y$ are in $\ell^2$, then
$\|x + y\| \leq \|x\| + \|y\|$.


Proof. This follows directly from the Cauchy-Schwarz Inequality and the properties
of inner products (Theorem 14.5.10), exactly as in the finite-dimensional case
(Theorem 14.4.9). □


Definition 14.5.13. The distance between the vectors $x = (x_1, x_2, x_3, \ldots)$ and
$y = (y_1, y_2, y_3, \ldots)$ in $\ell^2$ is $\|x - y\|$. That is, the distance between the vectors $x$ and $y$
in $\ell^2$ is

$$\sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots}$$

The definition of perpendicularity in $\ell^2$ is analogous to the definition in finite-dimensional
spaces.


Definition 14.5.14. The vectors $x$ and $y$ in $\ell^2$ are perpendicular (or orthogonal) if
$\langle x, y \rangle = 0$.


Some of the basic geometry of the plane and three-dimensional space has
analogues in $\ell^2$.


Theorem 14.5.15 (The Pythagorean Theorem in $\ell^2$). If $x$ and $y$ are in $\ell^2$ and $x$
is perpendicular to $y$, then $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.


Proof. Using the basic properties of the inner product on $\ell^2$ (Theorem 14.5.10)
gives

$$\begin{aligned}
\|x + y\|^2 &= \langle x + y, x + y \rangle \\
&= \langle x, x \rangle + \langle x, y \rangle + \langle y, x \rangle + \langle y, y \rangle \\
&= \|x\|^2 + 2\langle x, y \rangle + \|y\|^2
\end{aligned}$$

Since $x$ and $y$ are perpendicular, $\langle x, y \rangle = 0$ (Definition 14.5.14), and so the result
follows. □


The following theorem can be interpreted as stating that the sum of the squares 
of the lengths of the two diagonals of a parallelogram is equal to the sum of the 
squares of the lengths of its four sides. 


Theorem 14.5.16 (The Parallelogram Law). For vectors x and y in 2, 
Ix + yl? + lx = yl? = 2 (Ila? + IP) 


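Proof. The relationship between the inner product and the norm (Theorem 14.5.10(iv))
gives

$$\|x + y\|^2 + \|x - y\|^2 = \langle x + y, x + y\rangle + \langle x - y, x - y\rangle.$$

Expanding the right-hand side yields

$$\langle x, x\rangle + 2\langle x, y\rangle + \langle y, y\rangle + \langle x, x\rangle - 2\langle x, y\rangle + \langle y, y\rangle,$$

which equals $2\langle x, x\rangle + 2\langle y, y\rangle = 2\|x\|^2 + 2\|y\|^2$, as desired. □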
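
The Parallelogram Law can also be checked numerically on truncated vectors. The following sketch assumes two arbitrary sample sequences and a truncation length N; the function name norm_sq is a hypothetical helper, and the check illustrates the identity rather than proving it.

```python
import math

def norm_sq(v):
    """Squared norm of a finite coordinate list."""
    return sum(a * a for a in v)

N = 500  # hypothetical truncation length
x = [1.0 / k for k in range(1, N + 1)]
y = [1.0 / (k + 1) ** 2 for k in range(1, N + 1)]

diag_sum = [a + b for a, b in zip(x, y)]    # one diagonal: x + y
diag_diff = [a - b for a, b in zip(x, y)]   # other diagonal: x - y

lhs = norm_sq(diag_sum) + norm_sq(diag_diff)
rhs = 2 * (norm_sq(x) + norm_sq(y))

print(math.isclose(lhs, rhs, rel_tol=1e-12))
```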


14.6 A Difference Between Finite and Infinite-Dimensional Spaces


As we have seen, some of the properties of the infinite-dimensional space $\ell^2$ are
entirely analogous to the corresponding properties of finite-dimensional Euclidean 
spaces. On the other hand, there are a number of important differences. We will 
discuss one of them. 


Definition 14.6.1. In $\ell^2$, or in any finite-dimensional Euclidean space, if $y$ is a
vector and $r$ is a positive number, the closed ball with center $y$ and radius $r$ is
the set of all vectors, $x$, whose distance from $y$ is less than or equal to $r$; that is,
$\{x : \|x - y\| \le r\}$.


In $\mathbb{R}^2$, a closed ball is a circle together with its interior (see Figure 14.7). In $\mathbb{R}^3$,
a closed ball is a sphere together with its interior.


Fig. 14.7 A closed ball in $\mathbb{R}^2$


How many balls of a given radius $r$ can be inside a ball of radius $R$ without
intersecting each other? Using the areas of the balls in $\mathbb{R}^2$, or the volumes of the
balls in $\mathbb{R}^3$, would give certain limits; the total of the areas (or volumes) of the
smaller balls can’t exceed the area (or volume) of the larger ball. There are similar
limitations in $\mathbb{R}^n$ for any natural number $n$.
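
The area and volume comparison can be made concrete. Since the volume of a ball in n-dimensional space is proportional to the nth power of its radius, N non-intersecting balls of radius r inside a ball of radius R must satisfy N · r^n ≤ R^n. The sketch below computes the resulting upper bound on N; the function name volume_bound is hypothetical, and the bound is only an upper limit that need not be attained.

```python
import math
from fractions import Fraction

def volume_bound(R, r, n):
    """Largest integer N with N * r**n <= R**n, computed with exact rationals."""
    ratio = Fraction(R) / Fraction(r)
    return math.floor(ratio ** n)

print(volume_bound(2, Fraction(1, 2), 2))  # balls of radius 1/2 in a ball of radius 2, plane: 16
print(volume_bound(2, Fraction(1, 2), 3))  # same radii in three-dimensional space: 64
```

The radii 2 and 1/2 are chosen to match the example later in this section, where no such finite bound exists.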


Theorem 14.6.2. In any $\mathbb{R}^n$, given a closed ball $B$ and an $r > 0$, there cannot be
an infinite collection of non-intersecting balls of radius $r$ that are all contained in $B$.


Proof. Let $R > 0$ be the radius of the ball $B$. We start by assuming that $B$ is centered
at the origin. As we show below, the general case follows easily from this special 
case. 

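To prove the case where $B$ is centered at the origin, we first choose a finite set,
$S$, of points in $\mathbb{R}^n$ such that every point in $B$ is within $r$ of at least one point in $S$.
We then show that if there are more balls of radius $r$ in $B$ than points in $S$, then two
of the balls must intersect.

For this purpose, choose a natural number $m$ such that $m > R \cdot \frac{\sqrt{n}}{r}$. Let the set $S$
be defined as

$$S = \left\{ \left( \frac{a_1 r}{\sqrt{n}}, \frac{a_2 r}{\sqrt{n}}, \ldots, \frac{a_n r}{\sqrt{n}} \right) : \text{each } a_i \text{ is an integer with } |a_i| \le m \right\}.$$

Note that $S$ is a finite set, since there are only finitely many integers with absolute
value at most $m$.

If $t$ is any real number, there is some integer $a$ such that $a \le \frac{t\sqrt{n}}{r} < a + 1$.
Thus, $\frac{ar}{\sqrt{n}} \le t < \frac{(a+1)r}{\sqrt{n}}$, from which it follows that $\left| t - \frac{ar}{\sqrt{n}} \right| < \frac{r}{\sqrt{n}}$. Therefore, if
$x = (x_1, x_2, \ldots, x_n)$ is any point in $B$, then for each $x_i$ there is an integer $a_i$ such that
$\left| x_i - \frac{a_i r}{\sqrt{n}} \right| < \frac{r}{\sqrt{n}}$. This inequality implies that

$$\left| \frac{a_i r}{\sqrt{n}} \right| = \left| \frac{a_i r}{\sqrt{n}} - x_i + x_i \right| \le \left| \frac{a_i r}{\sqrt{n}} - x_i \right| + |x_i| < \frac{r}{\sqrt{n}} + |x_i|.$$

Since no $x_i$ can have absolute value greater than $R$ (otherwise $\|x\|$ would be greater
than $R$) and $\frac{mr}{\sqrt{n}}$ is greater than $R$,

$$\frac{|a_i| r}{\sqrt{n}} < \frac{r}{\sqrt{n}} + \frac{mr}{\sqrt{n}} = \frac{(m+1)r}{\sqrt{n}}.$$

Dividing through by $\frac{r}{\sqrt{n}}$ implies that $|a_i| < m + 1$; that is, $|a_i| \le m$. It follows that
the point $p = \left( \frac{a_1 r}{\sqrt{n}}, \ldots, \frac{a_n r}{\sqrt{n}} \right)$ is in $S$ and

$$\|x - p\| = \sqrt{\sum_{i=1}^{n} \left( x_i - \frac{a_i r}{\sqrt{n}} \right)^2} < \sqrt{\sum_{i=1}^{n} \frac{r^2}{n}} = r.$$
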


This shows that every ball of radius $r$ in $B$ contains a point in $S$: the center of such
a ball is a point of $B$, so it is within $r$ of some point of $S$. Thus, if $\mathcal{J}$ is a
collection of balls of radius $r$, each of which is contained in $B$, and $\mathcal{J}$ has more balls
than points in $S$, then two of the balls in $\mathcal{J}$ must contain the same point from $S$, and
so they intersect. Hence, there cannot be infinitely many non-intersecting balls of
radius $r$ within a ball of radius $R$ centered at 0.

To prove the case where the given ball is not centered at the origin, suppose there
are infinitely many non-intersecting balls of radius $r$ contained in a ball $B$ of radius
$R$ centered at a point $y$. We establish this general case by “translating” to the case
where the ball is centered at the origin. That is, subtracting $y$ from all the points
in each of the balls would give infinitely many non-intersecting balls of radius $r$
contained in a ball of radius $R$ centered at 0. This would contradict the previous
case. □
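
The key step of the proof, finding a point of $S$ within $r$ of a given point of $B$, can be sketched computationally. The code below follows the construction in the proof, taking each integer a_i to be the floor of x_i·√n/r; the function names and the sample values of R, r, and x are hypothetical and chosen only for illustration.

```python
import math

def grid_point(x, r):
    """Return the integers a_i = floor(x_i * sqrt(n) / r) and the point
    p = (a_1 r / sqrt(n), ..., a_n r / sqrt(n)) used in the proof."""
    n = len(x)
    step = r / math.sqrt(n)
    a = [math.floor(xi / step) for xi in x]
    p = [ai * step for ai in a]
    return a, p

def distance(u, v):
    """Euclidean distance between two points with the same number of coordinates."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(u, v)))

R, r = 2.0, 0.5            # hypothetical radii
x = [1.2, -0.7, 0.9]       # a sample point of the ball of radius R in three dimensions
n = len(x)

m = math.floor(R * math.sqrt(n) / r) + 1   # a natural number m > R * sqrt(n) / r
a, p = grid_point(x, r)

print(distance(x, p) < r)                  # p is within r of x
print(all(abs(ai) <= m for ai in a))       # p belongs to the finite set S
```

Because every a_i satisfies |a_i| ≤ m, the set of candidate grid points has at most (2m + 1)^n elements, which is what makes $S$ finite.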


In $\ell^2$, however, an infinite number of non-intersecting closed balls of a fixed
radius can be contained in a given closed ball. 


Example 14.6.3. In $\ell^2$, every closed ball of radius 2 contains an infinite collection
of closed balls of radius $\frac{1}{2}$ no two of which intersect.


Proof. We begin by considering the particular case of the ball of radius 2 centered
at the origin, which we denote by $B$. That is, $B$ is the set of all vectors in $\ell^2$ whose
norms are less than or equal to 2. For each natural number $k$, let $e_k$ be the vector in
$\ell^2$ whose $k$th coordinate is 1 and all of whose other coordinates are 0. Note that, for
every $k$, the norm of $e_k$ is 1. Define the ball $B_k$ to be the set of all vectors that are
at most $\frac{1}{2}$ from $e_k$. That is, $B_k = \{x \in \ell^2 : \|x - e_k\| \le \frac{1}{2}\}$. Thus, $\{B_k : k \in \mathbb{N}\}$ is
an infinite set of balls of radius $\frac{1}{2}$. We now show that each $B_k$ is contained in $B$ and
that no two distinct $B_k$’s intersect.

If $x$ is in $B_k$, then, by the Triangle Inequality (14.5.12), $\|x\| = \|x - e_k + e_k\| \le
\|x - e_k\| + \|e_k\| \le \frac{1}{2} + 1 < 2$. Thus, $B_k$ is contained in $B$ for every $k$.

To show that no two $B_k$’s intersect, suppose that $x$ was in both $B_i$ and $B_j$, where
$i$ and $j$ are distinct. Then $\|x - e_i\| \le \frac{1}{2}$ and $\|x - e_j\| \le \frac{1}{2}$, from which it follows,
using the Triangle Inequality (14.5.12), that

$$\|e_i - e_j\| = \|e_i - x + x - e_j\| \le \|e_i - x\| + \|x - e_j\| \le \frac{1}{2} + \frac{1}{2} = 1.$$

This contradicts the fact that $\|e_i - e_j\| = \sqrt{1 + 1} = \sqrt{2}$, which is greater than 1.
Therefore no $x$ can be in two distinct $B_k$’s, so no two $B_k$’s intersect.

To establish the case where the ball is not centered at the origin, let $C$ be a closed
ball of radius 2 centered at the vector $x_0$. For each natural number $k$, let $C_k$ equal
the set of all vectors $x + x_0$ such that $x$ is in $B_k$. It is straightforward to verify that
every $C_k$ is a closed ball of radius $\frac{1}{2}$ that is contained in $C$ and that the $C_k$ are non-
intersecting. □
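
The two facts that drive this example, that distinct centers are √2 apart and that each small ball stays inside the ball of radius 2, can be checked on the first few coordinates. The sketch below truncates each center to a finite number of coordinates; the truncation length dim and the helper names are assumptions made for illustration only.

```python
import math
from itertools import combinations

def e(k, dim):
    """First `dim` coordinates of the vector with 1 in position k (1-based) and 0 elsewhere."""
    return [1.0 if i == k else 0.0 for i in range(1, dim + 1)]

def distance(u, v):
    """Distance between two finite coordinate lists."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(u, v)))

dim = 6        # hypothetical number of centers and coordinates to inspect
radius = 0.5   # radius of each small ball, as in the example

centers = [e(k, dim) for k in range(1, dim + 1)]
pairwise = [distance(u, v) for u, v in combinations(centers, 2)]

# Distinct centers are sqrt(2) apart, which exceeds 2 * radius = 1, so by the
# triangle-inequality argument in the proof no point lies in two of the balls.
print(all(d > 2 * radius for d in pairwise))

# Any point within `radius` of a center has norm at most 1 + radius, which is below 2.
print(1.0 + radius < 2.0)
```

Increasing dim adds more centers without changing any pairwise distance, which is why the collection can be infinite in this space.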


14.7 Problems 
Basic Exercises 
1. Which of the following pairs of vectors in $\mathbb{R}^2$ are perpendicular to each other?
(a) (0, 1) and (1, 0) 


(b) (7, 2) and (x, —7) 
(c) (4/3, 22) and (V22, —3) 

2. Which of the following pairs of vectors in $\mathbb{R}^3$ are perpendicular to each other?


(a) (4, $.2) and (4,6, —§) 
(b) (—4, V2, 58) and (58, 4, 2) 


3. Suppose that $x$ and $y$ are vectors in R* whose coordinates are all positive. Show
that $x$ is not perpendicular to $y$.

4. Suppose that the vectors $x$ and $y$ in $\mathbb{R}^n$ are both perpendicular to the vector $z$.
Show that $x + y$ is perpendicular to $z$.

5. In $\ell^2$, for each natural number $k$, let $e_k$ be the vector whose $k$th coordinate is
1 and whose other coordinates are all 0. Prove that $e_i$ is perpendicular to $e_j$
whenever $i$ is different from $j$.


Interesting Problems 


6. Suppose that $x$ and $y$ are vectors in R* whose coordinates are all positive. Show
that $x$ is not perpendicular to $y$.

7. Show that, in $\mathbb{R}^n$ and in $\ell^2$, the vectors $x + y$ and $x - y$ are perpendicular to
each other if and only if $\|x\| = \|y\|$.

8. The space $\ell^1$ is defined to be the set of all sequences of real numbers
$(x_1, x_2, x_3, \ldots)$ such that $|x_1| + |x_2| + |x_3| + \cdots$ converges. The norm of the
vector $x = (x_1, x_2, x_3, \ldots)$ in $\ell^1$ is defined to be $|x_1| + |x_2| + |x_3| + \cdots$ and
is denoted $\|x\|$. The sum of the vectors $(x_1, x_2, x_3, \ldots)$ and $(y_1, y_2, y_3, \ldots)$
is defined to be $(x_1 + y_1, x_2 + y_2, x_3 + y_3, \ldots)$. The product of the vector
$(x_1, x_2, x_3, \ldots)$ by the real number $t$ is $(tx_1, tx_2, tx_3, \ldots)$. Suppose that $x$ and
$y$ are in $\ell^1$ and $t$ is a real number. Prove that $x + y$ is in $\ell^1$, $tx$ is in $\ell^1$, and
$\|x + y\| \le \|x\| + \|y\|$.

9. Prove that, for all vectors $x$ and $y$ in $\mathbb{R}^3$, $\langle x, y\rangle = \|x\| \cdot \|y\| \cos\theta$, where $\theta$ is
the angle between the vectors $x$ and $y$.
[Hint: Use the Law of Cosines as in the proof of Theorem 14.4.3.]


Challenging Problems 


10. Define $n$-dimensional complex space, denoted $\mathbb{C}^n$, to be the set of all
$n$-tuples of complex numbers, where the norm of the $n$-tuple $(x_1, x_2, \ldots, x_n)$
is $\sqrt{|x_1|^2 + |x_2|^2 + \cdots + |x_n|^2}$. The sum of two vectors is defined coordinatewise,
as is multiplication of a vector by a complex number. Define the inner
product of the vectors $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ to be
$\langle x, y\rangle = x_1\overline{y_1} + x_2\overline{y_2} + \cdots + x_n\overline{y_n}$. (Recall that, for any complex number
$z = a + bi$, the complex conjugate $\overline{z}$ is $a - bi$.)

(a) Prove the Cauchy–Schwarz Inequality: $|\langle x, y\rangle| \le \|x\| \cdot \|y\|$.
(b) Prove the Triangle Inequality: $\|x + y\| \le \|x\| + \|y\|$.

11. The norm on $\ell^1$ (defined in Problem 8) would be said to “arise from an inner
product” if there is a function taking pairs of vectors in $\ell^1$ to the real numbers
that satisfies the properties of the inner product listed in Theorem 14.5.10. Prove
that the norm on $\ell^1$ does not arise from an inner product.
[Hint: One way to do this is the following. First prove that if the norm did arise
from an inner product, then the norm would satisfy the “Parallelogram Law”
as stated in Theorem 14.5.16. Then find a pair of vectors in $\ell^1$ for which the
Parallelogram Law fails.]

12. Generalize Example 14.6.3 to prove:

(a) If $R > 0$ and $B$ is a closed ball in $\ell^2$ of radius $R$, then there is an $r > 0$
such that $B$ contains an infinite collection of non-intersecting closed balls
of radius $r$.
(b) If $R > 0$ and $B$ is a closed ball in $\ell^1$ of radius $R$, then there is an $r > 0$
such that $B$ contains an infinite collection of non-intersecting closed balls
of radius $r$.

13. (This problem requires the basics of integral calculus.) For $f$ and $g$ continuous
functions from $[0, 1]$ to $\mathbb{R}$, define $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$ and define
$\|f\| = \left( \int_0^1 |f(t)|^2\,dt \right)^{1/2}$.

(a) Prove that $\langle f, g\rangle$ has the properties of an inner product listed in Theorem 14.5.10.
(b) Prove that $\|f\|$ has the properties of a norm listed in Theorem 14.5.9.
(c) Prove that
$$\left| \int_0^1 f(t)g(t)\,dt \right| \le \left( \int_0^1 |f(t)|^2\,dt \right)^{1/2} \left( \int_0^1 |g(t)|^2\,dt \right)^{1/2}.$$
(d) Prove that
$$\left( \int_0^1 |f(t) + g(t)|^2\,dt \right)^{1/2} \le \left( \int_0^1 |f(t)|^2\,dt \right)^{1/2} + \left( \int_0^1 |g(t)|^2\,dt \right)^{1/2}.$$

14. Let $\mathcal{V}$ denote any one of the spaces $\mathbb{R}^n$, $\ell^1$, or $\ell^2$. Let $T$ be a linear
transformation from $\mathcal{V}$ to itself. That is, $T$ is a function from $\mathcal{V}$ to $\mathcal{V}$ such that:
(i) $T(x + y) = T(x) + T(y)$ for all vectors $x$ and $y$ in $\mathcal{V}$; and (ii) $T(rx) = rT(x)$
for all real numbers $r$ and all vectors $x$ in $\mathcal{V}$.

(a) Prove that $T(0) = 0$.
(b) Prove that $T$ is one-to-one (see Definition 10.1.5) if and only if the only
vector that satisfies $T(x) = 0$ is the zero vector.

15. Let $\mathcal{V}$ denote any one of the spaces $\mathbb{R}^n$, $\ell^1$, or $\ell^2$. As in calculus, a function $F$
taking $\mathcal{V}$ into $\mathcal{V}$ is said to be continuous at the vector $a$ if, for every $\varepsilon > 0$ there
is a $\delta > 0$ such that $\|F(x) - F(a)\| < \varepsilon$ whenever $\|x - a\| < \delta$.

(a) Prove that a linear transformation $T$ is continuous at every vector $a$ in $\mathcal{V}$ if
and only if $T$ is continuous at 0.

(b) A linear transformation $T$ is said to be bounded if there exists a positive
number $K$ such that $\|T(x)\| \le K\|x\|$ for every vector $x$. Prove that $T$ is
continuous at every $a$ if and only if $T$ is bounded.

(c) Prove that, for every $n$, every linear transformation from $\mathbb{R}^n$ to itself is
continuous at every vector in $\mathbb{R}^n$.